
Handling double-quoted CSVs in Azure Data Factory Pipelines

Azure Data Factory uses a backslash as the default escape character for CSVs, but we ran into an issue with this today while processing one of the CSV files from data.gov.au.  Like most CSVs, it wraps values in quotes and uses a pair of double-quotes for empty values, but it also doubles the quote character to escape quotes inside non-empty values. This probably sounds confusing, so here's an example:

"column 1","column 2","","column 4 value is ""sort of"" like this"

The ADF pipeline failed because, with backslash as the escape character, the doubled quotes inside the last value aren't treated as escaped quotes:
ErrorCode=UserErrorSourceDataContainsMoreColumnsThanDefined, found more columns than expected column count.
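As a sanity check that the file itself is fine, here's a minimal Python sketch (not part of the ADF pipeline) that reads the sample row with the standard csv module, which treats a doubled quote inside a quoted value as one literal quote. The names SAMPLE and parse_row are just for illustration.

```python
import csv
import io

# The sample row from above, exactly as it appears in the file.
SAMPLE = '"column 1","column 2","","column 4 value is ""sort of"" like this"\n'


def parse_row(line: str) -> list[str]:
    """Parse one CSV line, treating "" inside a quoted value as a literal quote."""
    reader = csv.reader(io.StringIO(line), quotechar='"', doublequote=True)
    return next(reader)


row = parse_row(SAMPLE)
print(len(row))  # 4
print(row[3])    # column 4 value is "sort of" like this
```

So the row really does have four columns, and the empty third value comes through as an empty string.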

The solution was to change the "Escape character" property on the dataset by clicking the "Edit" button beneath it and manually entering "", since "" isn't in the list of escape characters.  I didn't think this would work at first, but it turns out that the escape character doesn't have to be a single character, and it looks like the double-quotes used for empty values are processed separately from the double-quotes used as escape characters.  Handy!

Edit:
Unfortunately you can't just set "" as the escape character when creating the dataset. Even though ADF can process the CSV correctly with this setting, it gives you an error when importing the schema for the CSV:

"CSV serilization setting escapeChar cannot be more than one character"

So the trick is to leave it as backslash while you import the schema, and then change it to double-quotes afterwards, since the schema import seems to be the only step that complains about this escape character.
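To make the two steps concrete, here's a rough Python sketch that models the dataset's format settings as a plain dictionary. The property names (columnDelimiter, quoteChar, escapeChar) are the ones I'd expect to see in the dataset JSON for a delimited text dataset, but treat the exact shape as an assumption. The helper with_escape_char is made up for this example, and I set everything through the UI rather than in code.

```python
import json

# Step 1: settings to use while importing the schema (assumed property names).
schema_import_settings = {
    "columnDelimiter": ",",
    "quoteChar": '"',
    "escapeChar": "\\",  # default backslash, accepted by the schema import
}


def with_escape_char(settings: dict, escape_char: str) -> dict:
    """Return a copy of the settings with only escapeChar changed."""
    updated = dict(settings)
    updated["escapeChar"] = escape_char
    return updated


# Step 2: after the schema import, switch to the doubled quote entered via "Edit".
run_settings = with_escape_char(schema_import_settings, '""')

print(json.dumps({"typeProperties": run_settings}, indent=2))
```

The only thing that changes between the two steps is the escape character. The rest of the dataset stays as it was.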


Popular posts from this blog

Using WiX to create an event source during install of a .NET framework project

Edit: so I guess I wasn't the only one confused with this stuff, as it's been my most popular post by far!  If I've helped you out or saved you some time, please let me know in the comments :)

In order for this to work, you have to add references to WixUtilExtension and WixNetFxExtension to your WiX project.  Once that's done, add this inside a <Component> element:

<Util:EventSource xmlns:Util="http://schemas.microsoft.com/wix/UtilExtension" Name="EVENTSOURCEGOESHERE" Log="Application" EventMessageFile="[NETFRAMEWORK40FULLINSTALLROOTDIR]EventLogMessages.dll" />
Obviously replace EVENTSOURCEGOESHERE with your event source name.  NETFRAMEWORK40FULLINSTALLROOTDIR is a property set by the WixNetFxExtension which stores the path to the .NET framework v4 directory, but you can replace this with the corresponding property for the directory containing the relevant EventLogMessages.dll file.  So if you're using the .NET framewo…

Using Log4Net to use both event log and a rolling log file

Here's the config section. Note that the applicationName property in the EventLogAppender needs to be the same as the event source in the Windows event log that you want to log to.  If the event source doesn't exist, that appender won't work.  In this particular project I create it during install using WiX (which is covered in another post).

  <log4net debug="true">
    <appender name="RollingLogFileAppender" type="log4net.Appender.RollingFileAppender">
      <file value="log.txt" />
      <datePattern value="dd-MM-yyyy" />
      <appendToFile value="true" />
      <locationinfo value="false" />
      <rollingStyle value="Size" />
      <maximumFileSize value="1MB" />
      <maxSizeRollBackups value="10" />
      <staticLogFileName value="true" />
      <layout type="log4net.Layout.PatternLayout">
        <conv…

"A section using 'configSource' may contain no other attributes or elements" error after installing Application Insights

After installing the Application Insights NuGet package in an Umbraco solution, you'll get this error:

A section using 'configSource' may contain no other attributes or elements

<ExamineLuceneIndexSets configSource="config\ExamineIndex.config" />
     <log4net configSource="config\log4net.config">
         <root>
             <level value="ALL" />
             <appender-ref ref="aiAppender" />
Source File: \project\web.config

This happens because part of the Application Insights installation process adds a <log4net> section to web.config, which would make sense, except Umbraco already has a <log4net> section in /config/log4net.config.  So as you can imagine, the solution is to manually move everything it's added into that file. Unfortunately you can't just copy/paste the whole lot, but it's not particularly complicated:


Move <appender-ref ref="aiAppender" /> into the lo…