Skip to main content

Posts

Showing posts from June, 2019

Handling double-quoted CSVs in Azure Data Factory Pipelines

Azure Data Factory by default uses a backslash as the escape character for CSVs, but we ran into an issue with this today processing one of the CSV files from data.gov.au.  As with most CSVs they use quotes around values as normal and with double-quotes for empty values, but they also use double-quotes to escape quotes within non-empty values. This probably sounds confusing, so here's an example:

"column 1","column 2","","column 4 value is ""sort of"" like this"

The ADF pipeline failed because the double-quotes don't get escaped correctly:
ErrorCode=UserErrorSourceDataContainsMoreColumnsThanDefined, found more columns than expected column count.

The solution was to change the "Escape character" property on the dataset, by clicking the "Edit" button beneath it and manually entering "", since "" isn't in the list of escape characters.  I didn't think this would work at fir…

Handy SOQL query snippets

To count how many people logged in for the last time each year, for only inactive users:

select calendar_year(lastlogindate) LoginYear,count(id) Total
from user
where isactive = false
and lastlogindate <> null
group by calendar_year(lastlogindate)
order by count(id) desc