How can I extract table rows from an HTML email body?
A common task is to extract line-items from order emails. This can be done with a filter dedicated to extracting table rows from HTML emails.
This is how you set up a parsing rule which extracts line-items from an HTML email:
- Create a new parsing rule and select Body > HTML as a source
- Add the following filter "Extract Tabular Data > Get Tables from HTML"
This should give you a list of all table rows inside the email. In most cases you will also need further filters that remove the unnecessary rows and leave you with just the rows representing the line-items of the order. To do this, try adding the following filters:
- Filter rows by cell quantity
- Filter rows by content type
A typical use of this kind of filter would be "Only keep rows with four columns" (Cell Quantity Filter), because your line-items all have four data columns. Another way of keeping only the correct rows would be something like "The fourth cell needs to be a $ amount" (Cell Content Filter). By applying filters like these you should end up with just your line-items remaining.
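For readers who want to see the idea behind these filters outside the parsing-rule editor, here is a minimal Python sketch using only the standard library. It is not how the filters above are implemented. The names TableRowParser, extract_rows, keep_rows_with_cell_count and keep_rows_with_amount_in_cell, the sample HTML, and the dollar-amount pattern are all assumptions made for illustration. The sketch also ignores nested tables and assumes every cell has a closing tag.

```python
import re
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Hypothetical helper: collects the text of every <tr> as a list of cells."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only text inside an open cell is kept.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html):
    """Return all table rows found in the HTML body."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def keep_rows_with_cell_count(rows, count):
    """Analogue of 'Only keep rows with four columns'."""
    return [row for row in rows if len(row) == count]


# Assumed pattern for a dollar amount such as $7.50 or $1,200.00.
AMOUNT = re.compile(r"^\$\s?\d[\d,]*(\.\d{2})?$")


def keep_rows_with_amount_in_cell(rows, index):
    """Analogue of 'The fourth cell needs to be a $ amount' (index is zero-based)."""
    return [row for row in rows if len(row) > index and AMOUNT.match(row[index])]


# Made-up order email body used only for this example.
SAMPLE_HTML = """
<table>
<tr><th>Item</th><th>Qty</th><th>Unit price</th><th>Total</th></tr>
<tr><td>Widget</td><td>2</td><td>$5.00</td><td>$10.00</td></tr>
<tr><td>Gadget</td><td>1</td><td>$7.50</td><td>$7.50</td></tr>
<tr><td colspan="3">Order total</td><td>$17.50</td></tr>
</table>
"""

all_rows = extract_rows(SAMPLE_HTML)
four_cell_rows = keep_rows_with_cell_count(all_rows, 4)
line_items = keep_rows_with_amount_in_cell(four_cell_rows, 3)
print(line_items)
```

In this example the cell-count filter drops the "Order total" row, which has only two cells, and the content filter then drops the header row, whose fourth cell is text rather than an amount. Combining both filters is what leaves just the line-items.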
If you want more information, please continue reading on our blog: Extracting repeating text blocks and line items from email