My post earlier this year on “Bad Data is Worse than No Data” did not address what is likely a larger problem when it comes to actioning data sets. Under cover of ‘best intentions’, misreported data is misinformation.
An example: I ride the train from the Connecticut suburbs to Manhattan most weekdays. Sometimes I skip the monthly ticket and buy a multi-trip or single ticket instead. The conductor asks to see everyone’s ticket. Recently, when I display a non-monthly ticket, a few conductors have asked to scan the QR code. But not all of them do it, and some do it only occasionally. When you ride the train a lot, you do recognize the conductors on various trains.
Why does the MTA scan tickets? Data collection of course! At least I think. Actually I really have no idea. Some people display printed tickets that they hand to the conductor; others have the app. The result is that the data is terribly incomplete, disparate and seemingly of little value. Even if the MTA were to report that their data collection only represents 20% of the riding public, surely they cannot make decisions based on these random data?
Often when I get off the train in Grand Central Station there are two MTA workers standing back to back with counters, clicking away (as accurately as they can) at all the departing passengers. I am fairly certain that the MTA is not overlaying the counter data with the few scanned-ticket records from the mobile app. If every ticket from every rider were scanned or recorded with NFC (near-field communication) and entered into a database, then there’d be some juicy data to review – Actionable data! That’s decidedly not the case today.
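To make the overlay idea concrete, here is a minimal Python sketch of what comparing the two sources might look like. Everything in it is an assumption for illustration: the train labels, the numbers, and the `scan_coverage` helper are made up, and none of it is MTA data or an MTA system. The point is only that once clicker totals and scan counts sit side by side, you can see how little of the ridership the scans cover and how uneven that coverage is from train to train.

```python
# Hypothetical numbers for illustration only; this is not MTA data.
door_counts = {"train_A": 800, "train_B": 750, "train_C": 820}  # clicker totals at the platform
scanned = {"train_A": 40, "train_B": 300, "train_C": 0}         # QR scans logged on board


def scan_coverage(door_counts, scanned):
    """Return the share of counted riders whose ticket was scanned, per train."""
    return {
        train: scanned.get(train, 0) / riders
        for train, riders in door_counts.items()
        if riders > 0
    }


coverage = scan_coverage(door_counts, scanned)
overall = sum(scanned.values()) / sum(door_counts.values())

for train, share in coverage.items():
    print(f"{train}: {share:.0%} of counted riders scanned")
print(f"overall: {overall:.0%} of counted riders scanned")
```

In this made-up case the overall figure hides the fact that one train supplies most of the scans and another supplies none, which is exactly the kind of unevenness you would expect when scanning depends on which conductor is working.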
Commuter train travel has been around for a very long time in the U.S., and most systems are antiquated, whether it’s the equipment or the tracks themselves. For a data collection system on the MTA to be reliable (more or less), a massive change would need to take place. When I’ve traveled on trains in Asia, one cannot board the train in many large stations without a ticket. It’s hard for me to believe sometimes that conductors still wander up and down the aisles checking tickets and punching holes in them. I find most of the conductors to be pleasant and informed enough to offer concise answers to any travel questions about times and arrivals and changes. If they did not have to punch tickets, what else could they do? How many conductors are actually needed on a 10-car commuter train? There are two on most trains now.
The data pulled from scans, clicks, and transactions are varied and undoubtedly difficult to combine to get a better read. Is it actionable? I doubt its reliability, and for that reason I think not. But that probably won’t stop the MTA from acting on what they do have. It’s better than nothing, right? Not.
Is this any way to run a railroad?