At the beginning of my personal Dear Data project, I got myself a small notebook for recording the data. Unfortunately, after only a day or two of recording, I found myself with a lot of noisy or useless data.
I have encountered three problems so far:
- Not granular enough
- Too granular
- Uncertain guestimates
Not granular enough
In week 4 I was tracking how long after the purchase the item was destroyed (eaten, used until no longer reusable, etc.). There were three categories:
- (almost) immediate destruction
- same day destruction
- survives longer than a day
As you can see, that’s not granular enough. I should’ve recorded the number of days each item survived, plus a note whenever a purchased item wasn’t destroyed within the observed timeframe.
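A more granular record could look something like the sketch below. It is only an illustration: the `Purchase` class, its field names, and the sample dates are all made up here, not taken from my notebook. It stores the purchase date and the destruction date, and flags items that outlived the observation window.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Purchase:
    """One purchased item (hypothetical structure, for illustration only)."""
    item: str
    bought: date
    destroyed: Optional[date] = None  # None = not destroyed while I was observing

    def days_survived(self, observation_end: date) -> Tuple[int, bool]:
        """Return (days, still_alive).

        still_alive is True when the item was not destroyed by
        observation_end, so the day count is only a lower bound.
        """
        end = self.destroyed if self.destroyed is not None else observation_end
        return (end - self.bought).days, self.destroyed is None


# Made-up sample entries
bread = Purchase("bread", date(2024, 1, 1), date(2024, 1, 3))
soap = Purchase("soap", date(2024, 1, 2))
week_end = date(2024, 1, 7)
```

With entries like these, the old three categories can still be derived later from the day counts (0 days, and so on), while the reverse is not possible.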
Too granular
One statistic is currently on hold: communication with recruiters. At the moment I don’t have a tool to record the interactions. I tried to use my bullet journal for it, with a simple table that had worked in other data-collecting adventures.
The problem started when I interacted with someone and then there was a pause of a couple of days. Should the next message be treated as a new communication or as a continuation of the previous one? If it is a continuation, how can I be completely sure that I am connecting the correct sets of data?
For this reason I chose to postpone this data visualization until I can get my hands on a better tool for tracking the data than my bullet journal.
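One way a better tool might settle the "new or continuation" question is a fixed rule: a message from the same contact within some number of days continues the previous thread, anything later starts a new one. The sketch below assumes exactly that. The function name, the 7-day threshold, and the sample data are all hypothetical, and the rule does not remove the ambiguity, it only makes the decision consistent.

```python
from datetime import date


def group_into_threads(interactions, max_gap_days=7):
    """Group (contact, day) pairs into threads per contact.

    A new thread starts whenever the gap since the contact's previous
    interaction exceeds max_gap_days (an arbitrary, assumed threshold).
    Returns {contact: [[day, day, ...], [day, ...], ...]}.
    """
    threads = {}
    for contact, day in sorted(interactions):
        contact_threads = threads.setdefault(contact, [])
        if contact_threads and (day - contact_threads[-1][-1]).days <= max_gap_days:
            contact_threads[-1].append(day)   # continuation of the last thread
        else:
            contact_threads.append([day])     # pause was too long: new thread
    return threads


# Made-up sample log
log = [
    ("Recruiter A", date(2024, 3, 1)),
    ("Recruiter B", date(2024, 3, 2)),
    ("Recruiter A", date(2024, 3, 4)),
    ("Recruiter A", date(2024, 3, 20)),
]
threads = group_into_threads(log)
```

Recording only the contact and the date keeps the journal table simple; the threading is then decided afterwards instead of at the moment of writing.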
Uncertain guestimates
During the week when I was tracking people around me, I could always assume that there were some people around me, but how does one guess how many people one can and can’t see? No one, as far as I can tell, can stop time, count everyone they can see and hear, put a note in their journal, and resume normal time.
So every data point is a guestimate unless one is in an empty room or around a small group of people, say 30. Even then, do people who aren’t in the room, but whom one can hear and knows to be nearby, count as people nearby?
After three days I was certain that I needed to change my methodology from guestimating the number of people in the area where I currently am to a more relative measurement: how crowded the area feels. This methodology has a downside: a huge crowd present for a short period of time will skew the results.
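The skew can be illustrated with a small sketch. It assumes a 1 to 5 "crowdedness" scale and that each observation also notes roughly how many minutes it lasted; neither of these is something I actually recorded, and the numbers are invented. Weighting each rating by its duration keeps a ten-minute crush from counting as much as four quiet hours.

```python
def simple_mean(observations):
    """Average the crowdedness ratings, ignoring how long each lasted."""
    return sum(level for level, _ in observations) / len(observations)


def time_weighted_mean(observations):
    """Average the ratings, weighting each by its duration in minutes."""
    total_minutes = sum(minutes for _, minutes in observations)
    return sum(level * minutes for level, minutes in observations) / total_minutes


# (crowdedness 1-5, minutes) -- invented sample day
day = [
    (1, 240),  # quiet morning at a desk
    (5, 10),   # packed tram
    (2, 230),  # afternoon in a half-empty office
]

plain = simple_mean(day)            # pulled upwards by the short tram ride
weighted = time_weighted_mean(day)  # closer to how most of the day felt
```

The price is one more column in the journal, the duration, which is itself another guestimate.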