Merchants often need to move large amounts of data (such as product catalogs, orders or customer data) from one system to another for processing or organization. The issue is that how one system structures and accepts data might not be the same as the destination system. How do you transfer and share data between systems then (other than entering it by hand)?
many rely on integration that uses CSV (comma-separated values) files to export and import text files of their data between systems. CSVs are a universal file type that many systems support and merchants feel comfortable managing. While file integration is often an integration option, it also has its limitations. As you evaluate this data integration approach, this article considers why file-based integration via CSV files can be better than API-based integration.
What is Flat File Integration?
File-based integration is when either your source data and/or your destination data must be represented in a file (like a CSV file). Some systems require this as an alternative to an API or a direct database connection. File-based integration allows data to be passed between different databases, or operating systems, without necessarily connecting directly to them.
While file-based integration can include different file types (e.g. XML, JSON), this article focuses on CSV file integration in particular. A CSV file can only have a single file in it, though sometimes file-based integration requires you to combine elements from multiple files. They’re called ‘flattened’ files where a single record is defined across multiple rows.
CSVs are used to store tabular data. This means they can be imported and exported from programs that store data in tables, like Microsoft Excel or other spreadsheet software, to become a readable format. It’s important to note though that CSVs can’t include formulas or any of the formatting features Excel sheets provide. If you save an Excel document as a CSV, all that extra stuff will be stripped out.
Merchants have successfully used CSV file integration for decades to move data between systems for all sorts of reasons. However, in the past 10 – 15 years, APIs have emerged as a more “modern” way for two systems to communicate or integrate with each other. What’s the difference?
API Integration vs CSV Integration Approach
An API (Application Program Interface) is a set of code that allows two software programs to communicate with each other. A software’s API spells out a standard, predictable way for a developer to write a program requesting services from another system or application.
A REST (REpresentational State Transfer) API accepts HTTP requests to GET, PUT, POST or DELETE data. REST represents an architectural style and approach to communication used in most modern web services development. Therefore, it’s often used to build APIs that allow users to connect and interact with cloud services.
SOAP (Simple Object Access Protocol) is another protocol used to develop APIs. It is less common in new APIs than REST, but accomplishes basically the same thing.
API integration lets you connect two systems by interpreting and translating data between two systems via their APIs. It enables data flows such as “GET Product” from a known API endpoint in an ERP and “PUT Product” to a known API endpoint in an eCommerce platform.
While API integration is a more modern approach, it’s not always the right solution (or possible) for merchants looking to integrate their systems. Here’s an in-depth look over why file-based integration can be better than API integration.
Why File-based Integration Can Be Better than API Integration
There are scenarios where CSV integration might make the most sense for your business. See below some of the main advantages of this system integration approach.
Keep and Connect Systems without APIs
Not all software systems have available APIs for integration. This is often the case for older “legacy” software, but is even the case for modern systems like Salesforce Commerce Cloud (Demandware) eCommerce platform. Integrating with these systems must be done without using APIs, meaning your best option might be file-based integration.
Most systems at least accept CSV via a manual import process, no matter the age of the software. Some (again, like Salesforce Commerce Cloud) offer the ability to automate the CSV import by pointing to an SFTP server.
It’s important to note that many merchants will use this integration approach instead of taking on sometimes costly projects to upgrade or replace these API-less systems.
Non-proprietary
CSV files are non-proprietary to any specific software vendor. Creating and sharing a CSV is a generic way to handle data. This makes them easy to interact with and share among people and systems. You can create CSVs by leveraging Excel or other spreadsheet software and most people are comfortable handling them this way. Regardless of the specific software you’re using, being able to import and export a CSV file is not unique to that software or brand.
Easier to Create Files
Since CSVs are plain-text files, it’s easier for a web developer or other members of your team to create, view, and validate the data as a spreadsheet. All you need is a header row at the top and subsequent rows of data. You’ll be able to manipulate the data to organize it as you need. It’s then easy to share this data across different systems.
There’s no hierarchal relationship between data, unless you artificial represent that hierarchy with multiple CSV files. If you need to do this, it does make CSV-based integration more complex.
Common examples of CSV file merchants might create are pricing sheets, customer lists, product data, order data, and discounts to share between their eCommerce and ERP or POS system.
Standard Communication between Multiple Sources
CSV files are also a standard way to communicate data, which makes them a great tool to collect and compile data from multiple sources or systems. While the format of the data (which columns in which order) will likely vary, a CSV is a well-established, almost universally supported file type.
For example, you might collect product catalogs from a handful of different suppliers. Regardless of what system they use to master their product data, they should be able to export an CSV of all their product data. You then can pick up and import this CSV into your system. You don’t need to worry about integrating to their system directly. You may not even have direct API integration as an option due to security or other concerns.
Limitations of File-based Integration
There are drawbacks to this integration approach that you should consider, especially when comparing this approach to an alternative like API-based integration.
Security Responsibilities
When automating CSV file transfers, you’re essentially dropping a file on a server that is then picked up by another system. Who is maintaining the servers? How are you keeping your data secure throughout that process?
File-based integration requires merchants to think about security in a different way. While you want your data to be secure throughout the transfer, you can’t make access too difficult either. Otherwise, it’ll be very hard to communicate with other business systems. It’s your responsibility to find that balance and to make sure all the people (and no more) can access these files as needed.
Batch Syncing
When integrating systems, most merchants want a real-time, or instantaneous, sync of their data between systems.
This would be especially important for data like eCommerce items with quickly changing inventory availability or orders that must be fulfilled for two-day shipping. As soon as an order is placed online, you want to reduce that inventory quantity across all your sales channels and start order processing. Delays in this sync, even as small as 30 minutes, can cause overselling during peak sales times such as flash sales or holidays. They can also slow order fulfillment causing you to miss guaranteed delivery dates. Your customer suffers the consequences.
Most CSV file integration by default cannot sync instantaneously. This is because it’s not really possible to continually write new lines to a CSV while some other system is continually reading new lines from the CSV. You have to write a bunch of lines to a file, drop that file to be picked up, then start a new file.
This means whether your rules for starting a new file are “number of rows” or “period of time”, there will be some kind of delay in the process.
Think of CSV file integration another way: You aren’t delivered mail on-demand as it comes like an email would arrive. Once a day you receive potentially a batch of letters, and you send them out as a batch as well.
Handling Complex Data Structures
While a CSV’s flat file structure can be an advantage, it can also be its downfall. By design, CSV data is not hierarchical nor object-oriented, meaning they have a ubiquitous structure. For some merchants, this is a problem when you have multiple layers of data that relate to each other. Think about how you would need to represent orders with multiple line items or a product that comes in multiple sizes or colors.
You’ll end up creating multiple CSV files to try to represent this relationship between your data. It can be challenging to try to manage and organize large sets of data.
Validating Data
A crucial step in system integration is validating the data you’re transferring. You don’t want to send inaccurate or bad data to another system, especially since it can cause “invalid” errors. With CSV file integration, data validation is a manual process for its users.
When you’re dealing with several CSVs with hundreds of rows of data, validation becomes difficult to manage. You won’t be just able to “eyeball” that everything looks alright to send. Instead, users will rely on formulas or Macros written in Excel to highlight cells if they’re wrong. Then, you need to go in by hand to fix any errors. This process is time-consuming and faulty when reviewing large amounts of data.
In some cases, the system into which you are loading the CSV data may provide data validation on the imported CSV, but this has drawbacks as well. The existence and robustness of these features is not consistent platform to platform. And, often they don’t manage the import transaction well on a failure. That could mean that if one row out of 100,000 was incorrect, the entire import could fail.
Recovering from Errors During Transfer of Data
CSV file integration relies on dropping and picking up files on servers. Throughout that process, there’s a lot that can go wrong such as the file being corrupted or the server or file system being down. There are also not usually checks for invalid characters, missing data descriptions, or other file content issues. What happens when you send a file with these problems? How do you plan on fixing it?
Managing Compute Resources for Data Processing
Depending on how the code is written to integrate your systems, file-based integration can take a lot of CPU and RAM to read large data sets across multiple files.
Character Encoding Issues
Each system has its own default for encoding characters, for example UTF-8. When sharing text files (like CSV), merchants must consider that the files need to use the same encoding characters. If not, one of your CSV file can be rejected or the data may appear incorrect because the system isn’t able to read your characters correctly. For example, the destination system might not be able to read specialized apostrophes or characters for Asian languages.
Learn More about CSV Integration
Merchants have options when it comes to integrating their systems. And for many, file-based integration through CSVs can be the right option. However, this integration approach doesn’t come without its disadvantages that you should consider.
If you’re interested in learning more about how to automate your business through file-based integration, learn more about nChannel’s file-transfer solution, which is built to directly address all the challenges described in this document.
Only a marketing wonk would give reasons why tossing CSV files over the fence is “better” than API integration. One of the reasons given is that they are not “proprietary”. Well, neither are API integrations. In fact, because APIs are published interfaces that describe functions, data types, return types, cardinality, and other things, they become non-proprietary.