Demo - BlueLabs Analytics

  • How are dates and times represented?
  • Is there a header row at the top to tell you what the column names are?
  • Is the file compressed? By what tool?
  • Which newline convention is in use: Windows (CRLF), or Linux and macOS (LF)?
  • How are characters encoded? Is this ASCII? Maybe UTF-8, as macOS and Linux typically write out? Or maybe UTF-16, or UTF-8 with a byte-order mark, as some Windows tools like to add? Or do you have one of those ancient punchcards in EBCDIC?
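
Several of the questions above can be answered, at least roughly, by peeking at the raw bytes. Here is a minimal sketch in Python, assuming the whole file fits in memory; the function name `describe_csv_bytes` is made up for illustration, the UTF-8 fallback is a guess rather than a detection, and `csv.Sniffer` is only a heuristic, so treat every answer as a hint and not a guarantee.

```python
import codecs
import csv
import gzip


def describe_csv_bytes(raw: bytes) -> dict:
    """Guess a few format properties of a CSV payload (heuristics only)."""
    info = {}

    # gzip streams start with the two magic bytes 0x1f 0x8b.
    info["gzip"] = raw[:2] == b"\x1f\x8b"
    if info["gzip"]:
        raw = gzip.decompress(raw)

    # A byte-order mark, when present, gives the encoding away.
    if raw.startswith(codecs.BOM_UTF8):
        info["encoding"] = "utf-8-sig"
    elif raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        info["encoding"] = "utf-16"
    else:
        # Assumption: with no BOM we simply try UTF-8 (ASCII is a subset).
        # Other encodings, EBCDIC included, would need separate handling.
        info["encoding"] = "utf-8"
    text = raw.decode(info["encoding"])

    # Windows writes CRLF; Linux and modern macOS write LF;
    # a bare CR suggests a classic Mac OS file.
    if "\r\n" in text:
        info["newline"] = "CRLF"
    elif "\r" in text:
        info["newline"] = "CR"
    else:
        info["newline"] = "LF"

    # csv.Sniffer compares the first row against the rows below it.
    info["has_header"] = csv.Sniffer().has_header(text)
    return info
```

Calling `describe_csv_bytes(open(path, "rb").read())` returns a small dict of guesses. Date and time formats are left out on purpose, since those usually need a human to look at the actual values.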
This matches what a database stores pretty well, right? Columns with names, and rows representing individual data items.

The earliest reference Wikipedia has about CSV files dates back to the halcyon days of the early 1970s, when IBM wanted an easier way to enter data via punchcards in the latest version of their FORTRAN language. Up until that point, folks doing data entry would need to line up particular columns of data to particular places on the punchcards: first name in column 12, last name in column 45, SSN in column 60, etc. Error-prone, I’m sure! So instead, people would enter a delimiter character, such as a comma, between each column of data. More data can fit per card, since there’s only one character of wasted space between data items, and it’s still readable by humans.
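
To make the contrast concrete, here is a small Python sketch that parses the same record both ways. The column positions come from the example above; the 80-column card width, the field names, and the placeholder values are assumptions for illustration only.

```python
import csv
import io

# Layout from the text, as 1-based starting columns on the card.
FIELDS = [("first_name", 12), ("last_name", 45), ("ssn", 60)]
CARD_WIDTH = 80  # assumption: a standard 80-column punchcard


def parse_fixed_width(card: str) -> dict:
    """Slice each field from its starting column up to the next field's start."""
    starts = [col - 1 for _, col in FIELDS] + [CARD_WIDTH]
    return {
        name: card[starts[i]:starts[i + 1]].strip()
        for i, (name, _) in enumerate(FIELDS)
    }


def parse_delimited(line: str) -> dict:
    """Split one comma-separated record into the same named fields."""
    row = next(csv.reader(io.StringIO(line)))
    return dict(zip((name for name, _ in FIELDS), row))


# Fixed-width: pad so each value begins exactly at its assigned column.
card = " " * 11 + "Ada".ljust(33) + "Lovelace".ljust(15) + "000-00-0000"
# Delimited: the same placeholder values, separated by commas.
line = "Ada,Lovelace,000-00-0000"

print(parse_fixed_width(card))
print(parse_delimited(line))
print(len(card), len(line))
```

Both parsers produce the same fields, but the delimited line is much shorter, and nobody has to count columns to get the last name into the right place.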