PortiBlog

Confusing RSS connector in Flow

4 January 2019

During the holidays I finally found some time to play around with Flow again. I currently have a recipe running in IFTTT that posts a tweet whenever I finish a book on Goodreads. The recipe uses an RSS trigger and a Twitter connector, so it is pretty straightforward, and rebuilding it in Flow sounded just as straightforward. However, as I found out in the first few minutes, the RSS trigger in Flow is quite limited. Flow provides the ‘when a feed item is published’ trigger, which fires whenever a new item is added to an RSS feed. As Goodreads provides an RSS feed, this sounded like a great place to start. While the trigger does pick up the changes, it does not return the full item, so you might be missing information when you try to process the data.

 

Flow RSS trigger

The Flow RSS connector is a standard connector that you can use to retrieve feed information and to trigger a flow when new items are published to an RSS feed. Updates to existing items will not trigger the flow. As you can read in the documentation, the trigger returns a wrapper object that contains all feed items. Whatever data your feed contains, the FeedItem that Flow returns only exposes the following values:

  • Feed ID
  • Feed categories
  • Feed copyright information
  • Feed links
  • Feed published on
  • Feed summary
  • Feed title
  • Feed updated on
  • Primary feed link

 

In most cases the title and summary should contain the information you need. However, if you are using a Goodreads RSS feed you only get the title of your book and some HTML in the summary, but no other information. Take the Goodreads RSS feed for the books you have read. When you use a feed reader or view the source of the feed, you will see that it returns an array of items. Each item looks like the following:

<item>
<guid><![CDATA[https://www.goodreads.com/review/show/2622187455?utm_medium=api&utm_source=rss]]></guid>
<pubDate><![CDATA[Wed, 02 Jan 2019 01:41:25 -0800]]></pubDate>
<title>Icarus (Benny Griessel, #5)</title>
<link><![CDATA[https://www.goodreads.com/review/show/2622187455?utm_medium=api&utm_source=rss]]></link>
<book_id>28257568</book_id>
<book_image_url><![CDATA[https://images.gr-assets.com/books/1450689371s/28257568.jpg]]></book_image_url>
<book_small_image_url><![CDATA[https://images.gr-assets.com/books/1450689371s/28257568.jpg]]></book_small_image_url>
<book_medium_image_url><![CDATA[https://images.gr-assets.com/books/1450689371m/28257568.jpg]]></book_medium_image_url>
<book_large_image_url><![CDATA[https://images.gr-assets.com/books/1450689371l/28257568.jpg]]></book_large_image_url>
<book_description><![CDATA[17 december. Het lichaam van Ernst Richter, internetondernemer en oprichter van de controversiële website Alibi.co.za, wordt aangetroffen in een ondiep graf in de zandduinen vlak bij Parklands. Hij werd sinds een maand vermist.<br /><br />Niemand binnen de politiemacht wil de zaak aannemen vanwege de politieke consequenties voor de betrokkenen én de onherroepelijke mediatsunami die zal volgen op de moord op een persoon zo omstreden als Richter. En dus mag de speciale eenheid 'de Valken' ermee aan de slag.<br /><br />Zelfs onder normale omstandigheden zou deze zaak al genoeg kopzorgen veroorzaken voor Bennie Griessel en Vaughn Cupido. De medewerkers van Alibi.co.za behoren niet bepaald tot het mededeelzame soort en de lijst van mensen die Richter met hun blote handen willen wurgen is lang. Maar de omstandigheden zijn allesbehalve normaal. Griessel heeft de fles weer ontdekt en Cupido is verliefd - op een van de hoofdverdachten.<br /><br />En op 24 december zet de verklaring van een jonge wijnboer de zaak volledig op zijn kop.<br /><br />Kerstmis zal nooit meer hetzelfde zijn.]]></book_description>
<book id="28257568">
<num_pages>395</num_pages>
</book>
<author_name>Deon Meyer</author_name>
<isbn>9044973789</isbn>
<user_name>Albert-Jan</user_name>
<user_rating>0</user_rating>
<user_read_at></user_read_at>
<user_date_added><![CDATA[Wed, 02 Jan 2019 01:41:25 -0800]]></user_date_added>
<user_date_created><![CDATA[Mon, 10 Dec 2018 14:04:59 -0800]]></user_date_created>
<user_shelves>currently-reading, 2018</user_shelves>
<user_review></user_review>
<average_rating>3.50</average_rating>
<book_published>2015</book_published>
<description>
<![CDATA[
<a href="https://www.goodreads.com/book/show/28257568-icarus?utm_medium=api&amp;utm_source=rss"><img alt="Icarus (Benny Griessel, #5)" src="https://images.gr-assets.com/books/1450689371s/28257568.jpg" /></a><br/>
author: Deon Meyer<br/>
name: Albert-Jan<br/>
average rating: 3.50<br/>
book published: 2015<br/>
rating: 0<br/>
read at: <br/>
date added: 2019/01/02<br/>
shelves: currently-reading, 2018<br/>
review: <br/><br/>
]]>
</description>
</item>

However, when you check the response you get from the trigger, it contains only the following information:

{
"id": "https://www.goodreads.com/review/show/2622187455?utm_medium=api&utm_source=rss",
"title": "Icarus (Benny Griessel, #5)",
"primaryLink": "https://www.goodreads.com/review/show/2622187455?utm_medium=api&utm_source=rss",
"links": [
"https://www.goodreads.com/review/show/2622187455?utm_medium=api&utm_source=rss"
],
"updatedOn": "0001-01-01 00:00:00Z",
"publishDate": "2019-01-02 09:41:25Z",
"summary": "\n \n <a href=\"https://www.goodreads.com/book/show/28257568-icarus?utm_medium=api&amp;utm_source=rss\"><img alt=\"Icarus (Benny Griessel, #5)\" src=\"https://images.gr-assets.com/books/1450689371s/28257568.jpg\" /></a><br/>\n author: Deon Meyer<br/>\n name: Albert-Jan<br/>\n average rating: 3.50<br/>\n book published: 2015<br/>\n rating: 0<br/>\n read at: <br/>\n date added: 2019/01/02<br/>\n shelves: currently-reading, 2018<br/>\n review: <br/><br/>\n \n ",
"copyright": "",
"categories": []
}

With that information it is quite hard to post a tweet: the image is missing, there is no information on the user rating, and the author is only available as part of the HTML in the summary.
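To make the gap concrete, here is a minimal Python sketch (not part of the flow itself) that lists which child elements of an item have no counterpart in the trigger response. The item is an abbreviated copy of the sample above, and the tag-to-field mapping in `SURFACED` is an assumption inferred from comparing the two samples, not something taken from the connector documentation.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the Goodreads <item> shown above (query strings dropped).
ITEM_XML = """<item>
  <guid>https://www.goodreads.com/review/show/2622187455</guid>
  <pubDate>Wed, 02 Jan 2019 01:41:25 -0800</pubDate>
  <title>Icarus (Benny Griessel, #5)</title>
  <link>https://www.goodreads.com/review/show/2622187455</link>
  <book_large_image_url>https://images.gr-assets.com/books/1450689371l/28257568.jpg</book_large_image_url>
  <author_name>Deon Meyer</author_name>
  <user_rating>0</user_rating>
  <description>author: Deon Meyer</description>
</item>"""

# Assumed mapping (inferred from the samples): RSS tag -> trigger response field.
SURFACED = {
    "guid": "id",
    "title": "title",
    "link": "primaryLink",
    "pubDate": "publishDate",
    "description": "summary",
}

item = ET.fromstring(ITEM_XML)

# Every child element whose tag is not in the mapping is lost in the trigger response.
dropped = [child.tag for child in item if child.tag not in SURFACED]
print(dropped)
```

For this abbreviated item the dropped tags are exactly the ones needed for the tweet: the image URL, the author name and the user rating.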

How to retrieve all information from RSS

In order to retrieve all the information you require from your RSS feed you need four steps:

  • Step 1 is an RSS trigger that fires whenever a new item is published
  • Step 2 is an HTTP call to retrieve the full XML for the RSS feed again
  • Step 3 is to create an XPath filter that allows you to retrieve the correct item. We need an XPath filter because there might have been multiple updates on the RSS feed and we only want the data for the item that fired the trigger.
  • Step 4 is to retrieve the data based on the XPath filter

Flow actions to retrieve full RSS body

 

The first two actions are pretty straightforward. Just add an RSS trigger and use the same URL in an HTTP GET call. Please be aware that the HTTP action will become a premium connector :-(.

Flow trigger and action to retrieve data

Use XPath to get your data

Now that the HTTP GET action has retrieved all the XML data from the RSS feed, we can use an XPath filter. For Goodreads the XPath filter looks like this:

concat('(//rss/channel/item[title = "', triggerBody()?['title'], '"])')

As you can see, it constructs a new string containing the //rss/channel/item[title="booktitle"] XPath query. That way you can be sure that you retrieve the correct XML, as long as the title is unique within the feed and does not itself contain a double quote. The next step is to retrieve the book data element, simply by applying the XPath query to the data retrieved from the HTTP request:

xml(xpath(xml(body('HTTP')), variables('XPathFilter'))[0])

Reading from left to right: the outer xml() wrapper makes sure the result is returned as XML. The XPath query can only be run against XML, so the HTTP body is wrapped in xml() as well. And because an XPath query can return multiple values, [0] is added to retrieve the first item that matches.
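The same build-filter-then-take-first logic can be tried outside Flow. The following is a minimal Python sketch under a few assumptions: the feed is a tiny made-up sample with a hypothetical second book, the function names are mine, and because Python's ElementTree only supports a subset of XPath the path is written relative to the `<rss>` root instead of starting with `//rss`. The guards for a double quote in the title and for an empty result are additions of this sketch; the Flow expressions above do not have them.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for the HTTP response body; "Some Other Book" is made up.
FEED_XML = """<rss version="2.0"><channel>
  <item><title>Some Other Book</title><user_rating>4</user_rating></item>
  <item><title>Icarus (Benny Griessel, #5)</title><user_rating>0</user_rating></item>
</channel></rss>"""


def build_item_filter(title: str) -> str:
    """Mirror the concat(): embed the title in a double-quoted XPath literal."""
    if '"' in title:
        # XPath 1.0 cannot escape a quote inside a literal that uses the same quote.
        raise ValueError("title contains a double quote; the filter would be invalid")
    return f'./channel/item[title="{title}"]'


def first_matching_item(feed_xml: str, title: str) -> ET.Element:
    """Parse the feed, apply the filter and return the first match (the [0] step)."""
    root = ET.fromstring(feed_xml)  # root is the <rss> element
    matches = root.findall(build_item_filter(title))
    if not matches:
        raise LookupError(f"no item with title {title!r} in the feed")
    return matches[0]


item = first_matching_item(FEED_XML, "Icarus (Benny Griessel, #5)")
print(ET.tostring(item, encoding="unicode"))
```

The filter matches on the title only, so if two items share a title the first one in the feed wins, which is the same behaviour as the [0] in the Flow expression.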

Flow action to filter data

The response of the Retrieve Book data action is a single item element. You can run additional XPath queries against this element to retrieve properties that are normally not available in your RSS flow. For instance, for the Goodreads RSS feed you can create a new variable with, or inject directly into your action, the following XPath:

xpath(xml(variables('BookXmlData')), 'string(/item/user_rating)')

This retrieves the user_rating for the book you have just added to your read shelf. You can use XPath to retrieve any property or element from the <item> element that is returned. You can find more resources on how to use XPath on w3schools.
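As an illustration of pulling several properties out of a single item in one go, here is a small Python sketch. It assumes an abbreviated copy of the item from the sample feed; the `book_details` helper and the keys of the dictionary it returns are hypothetical names chosen for this example, not something Flow or Goodreads defines.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the <item> element from the sample feed.
ITEM_XML = """<item>
  <title>Icarus (Benny Griessel, #5)</title>
  <book_large_image_url><![CDATA[https://images.gr-assets.com/books/1450689371l/28257568.jpg]]></book_large_image_url>
  <author_name>Deon Meyer</author_name>
  <user_rating>0</user_rating>
</item>"""


def book_details(item_xml: str) -> dict:
    """Read the properties needed for a tweet from a single <item> element."""
    item = ET.fromstring(item_xml)
    return {
        # findtext() plays the role of string(/item/<tag>) in the Flow expression.
        "title": item.findtext("title", default=""),
        "author": item.findtext("author_name", default=""),
        "rating": int(item.findtext("user_rating", default="0")),
        "image": item.findtext("book_large_image_url", default=""),
    }


details = book_details(ITEM_XML)
print(details)
```

Each lookup corresponds to one xpath() call in Flow, so in the flow itself you would add one variable or inline expression per property you need.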

Using flow for tweets

I use the above flow to auto-tweet whenever I finish a book on Goodreads. Before, I used a recipe in IFTTT that did pretty much the same but didn’t require any hacking. The downside of IFTTT is that you cannot tweak it at all. The plus side is that its RSS action gives you better results than Flow provides out of the box. Now that I have figured out how to retrieve all XML properties using XPath, I feel I can use the same approach for some other scenarios as well. I am a bit sad that the HTTP action will move to premium, though.

Originally posted at: https://www.cloudappie.nl/flow-confusing-rss-connector/

Albert-Jan Schot

CTO & MVP

Tags

Blog
Techblog