If enabled, all events created by the Agent will have a `sort_info` key
whose value is a hash containing the keys `position` and `count`.
This overrides #1768.
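
As a rough illustration of the resulting payload shape, here is a minimal sketch. The helper name `with_sort_info` is hypothetical, and the 1-based `position` is an assumption, not something stated above.

```ruby
# Hypothetical helper: attach a `sort_info` hash to each payload in an
# ordered list. Assumes `position` is 1-based and `count` is the total
# number of events in the batch.
def with_sort_info(payloads)
  count = payloads.size
  payloads.each_with_index.map do |payload, index|
    payload.merge('sort_info' => { 'position' => index + 1, 'count' => count })
  end
end

events = with_sort_info([{ 'title' => 'a' }, { 'title' => 'b' }])
```

A receiving Agent could then use `position` and `count` to tell where an event sits within its batch.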
* Initial draft of PhantomJsCloudAgent
Generates event with url for fetching html/plainText content
* Add options
* Pass in event instead of url
Fix hash syntax
Remove whitespace
Add mode merge
* Add some tests
* Style changes
- Add link to wiki entry for manually creating agent with full set of
options
- Replace hand-made mocking for web requests with WebMock
- Stop overriding internal methods of Agent like `interpolated`, because
that made the specs not reflect actual behavior
The SAX parser Feedjira uses (Nokogiri::XML::SAX) tries to detect the
encoding of a document from the content even if it is already known
and given. This results in content being decoded twice by
WebRequestConcern and the SAX parser if its encoding is declared in
both the Content-Type header and the XML declaration.
This commit makes RssAgent remove the `encoding` attribute from the
XML declaration of a document if the encoding is already known by the
Content-Type header.
Fixes #1797.
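
The idea can be sketched roughly as follows. This is not the actual RssAgent code: `strip_declared_encoding` and its `header_charset` parameter are hypothetical names, and the regular expression is a simplified assumption about what an XML declaration looks like.

```ruby
# Hypothetical helper: drop the `encoding` attribute from the XML
# declaration, but only when the charset is already known from the
# Content-Type header (header_charset is non-nil).
def strip_declared_encoding(xml, header_charset)
  return xml if header_charset.nil?

  # Keep everything in the declaration up to the encoding attribute.
  xml.sub(/\A(\s*<\?xml\b[^>]*?)\s+encoding\s*=\s*(["'])[^"']*\2/) { $1 }
end

doc = '<?xml version="1.0" encoding="ISO-8859-1"?><rss/>'
cleaned = strip_declared_encoding(doc, 'UTF-8')
```

With the attribute gone, the parser has no second encoding declaration to act on.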
If given, it is used as a Liquid template for each event created by the
Agent, instead of directly emitting the results of extraction as events.
An existing spec needs to be fixed because WebsiteAgent now has the
`template` option, which may not be a hash of hashes.
Previously, WebsiteAgent always assumed that content with no charset
specified in the Content-Type header would be encoded in UTF-8. This
enhancement is to make use of the encoding detector implemented in
Nokogiri for HTML/XML documents, instead of blindly falling back to
UTF-8.
When the document `type` is `html` or `xml`, WebsiteAgent tries to
detect the encoding of a fetched document from the presence of a BOM,
XML declaration, or HTML `meta` tag.
This fixes #1742.
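
For readers unfamiliar with this kind of detection, the order of checks can be approximated in plain Ruby. This is a simplified sketch and not Nokogiri's detector: `sniff_encoding`, the `BOMS` table, and the 1024-byte window are all assumptions made for illustration.

```ruby
# Byte order marks checked first (illustrative subset).
BOMS = {
  "\xEF\xBB\xBF".b => 'UTF-8',
  "\xFF\xFE".b => 'UTF-16LE',
  "\xFE\xFF".b => 'UTF-16BE'
}.freeze

# Hypothetical helper: guess an encoding name from a BOM, an XML
# declaration, or an HTML meta tag, in that order. Returns nil if
# nothing is found.
def sniff_encoding(body)
  bytes = body.b # work on raw bytes, independent of the string's tag
  BOMS.each { |bom, name| return name if bytes.start_with?(bom) }

  head = bytes[0, 1024] # only inspect the start of the document
  if (m = head.match(/\A\s*<\?xml[^>]*\bencoding\s*=\s*["']([\w.:-]+)["']/n))
    return m[1]
  end
  if (m = head.match(/<meta[^>]+charset\s*=\s*["']?([\w.:-]+)/ni))
    return m[1]
  end
  nil
end
```

A caller would fall back to UTF-8 only when this returns nil.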
The `as_object` filter returns the received data/object as is, without casting it to a string as Liquid normally does. It
can be used as a JSONPath replacement or to emit the result of a Liquid filter chain as an array.
`catch` and `throw` need to be used to break out of the Liquid render chain. Liquid aggregates the output of every
expression into an array and [joins](https://github.com/Shopify/liquid/blob/v3.0.6/lib/liquid/block.rb#L147) it together; that
join makes it impossible to get anything other than a string out of a Liquid template.
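
The `catch`/`throw` mechanism can be shown without Liquid at all. In this sketch, `render` is a stand-in for a renderer that joins its output, and the `:as_object` tag and `render_or_object` wrapper are hypothetical names, not necessarily those used in the implementation.

```ruby
# Stand-in for a renderer that joins every expression's output into a
# single string, the way a template engine would.
def render(expressions)
  expressions.map(&:call).join
end

# Hypothetical filter: abort rendering and hand back the raw object.
def as_object(value)
  throw :as_object, value
end

# Wrap rendering in `catch` so a thrown object bypasses the join.
def render_or_object(expressions)
  catch(:as_object) { render(expressions) }
end

result = render_or_object([-> { 'ignored ' }, -> { as_object([1, 2, 3]) }])
text = render_or_object([-> { 'a' }, -> { 'b' }])
```

Here `result` is the thrown Array and `text` is an ordinary joined String, since nothing was thrown in the second call.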