2632 commits

Author SHA1 Message Date
Akinori MUSHA
5f5f3cd38f Do not err if headers is a valid headers hash 2016-11-27 13:38:23 +09:00
Akinori MUSHA
9074f3115e Make sure status is an integer when set 2016-11-27 13:38:23 +09:00
Akinori MUSHA
a94cd7fd6d Allow an empty or null base URI 2016-11-27 13:38:23 +09:00
Akinori MUSHA
3a0c9e6274 Disable automatic URL normalization and absolutization on url
This was discussed in #1766.

For backward compatibility, existing WebsiteAgents with a key named
`url` will be given a `template` to resolve `url`.
2016-11-27 13:38:23 +09:00
Akinori MUSHA
3d91469733 Make WebsiteAgent merge template with the results of extract (#1816)
A new extraction option `hidden` is added so that keys with it get
excluded from the final payloads while they can still be used in `template`.
2016-11-27 13:33:24 +09:00
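
The merge described in 3d91469733 can be sketched roughly as follows. This is only an illustration, not the WebsiteAgent code: `build_payload` is a hypothetical helper, and the `{{ name }}` substitution is a toy stand-in for Liquid rendering.

```ruby
# Hypothetical sketch: merge rendered template values with extraction
# results, dropping keys whose extractor is marked "hidden".
def build_payload(extracted, extractors, template)
  # Toy placeholder substitution standing in for Liquid (assumption).
  rendered = template.transform_values do |value|
    value.gsub(/\{\{\s*(\w+)\s*\}\}/) { extracted[Regexp.last_match(1)].to_s }
  end

  # Hidden keys stay usable in the template above but are removed here.
  visible = extracted.reject { |key, _| extractors.dig(key, "hidden") }

  # Template keys win over extracted keys of the same name.
  visible.merge(rendered)
end
```

With an extractor `path` marked as hidden and a template key `url` set to `https://example.com{{ path }}`, the resulting payload would contain `url` but not `path`.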
bobbysteel
f530305edc Add class of service chooser for Google Flights Agent (#1778)
* Add class of service chooser

* Add cabin chooser test

* Fix preferredCabin

* Per @cantino feedback taking out check
2016-11-26 15:13:00 -05:00
Akinori MUSHA
7ac691652b Spec that force_encoding works with encoding declaration in RssAgent 2016-11-26 13:26:09 +09:00
Akinori MUSHA
b8d88aa9a3 Merge pull request #1813 from cantino/fix_decoding_in_rss_agent
Fix a double-decoding problem in RssAgent
2016-11-23 09:05:46 +09:00
Dominik Sander
5d69bd2d93 Merge pull request #1804 from dsander/cache-agent-type-dropdown
Cache Agent type select options in Agent#new
2016-11-22 21:04:29 +01:00
Dominik Sander
6daa6cc75e Cache Agent type select options in Agent#new
Agent#new was by far the slowest action of all controllers. When using
many Agent gems the response time goes up to a point where the user
starts to wonder if something is going wrong. By caching the Agent
description the response time goes down from about 1 second to 100ms
in development.
2016-11-22 20:02:27 +01:00
Akinori MUSHA
aa6d8be697 Assume that an XML declaration is at the beginning of a document 2016-11-23 01:17:46 +09:00
Akinori MUSHA
0b3700999b Fix a double-decoding problem in RssAgent
The SAX parser Feedjira uses (Nokogiri::XML::SAX) tries to detect the
encoding of a document from the content even if it is already known
and given.  This results in the content being decoded twice, once by
WebRequestConcern and once by the SAX parser, if its encoding is
declared in both the Content-Type header and the XML declaration.

This commit makes RssAgent remove the `encoding` attribute from the
XML declaration of a document if the encoding is already known by the
Content-Type header.

Fixes #1797.
2016-11-22 12:14:28 +09:00
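
A minimal sketch of the idea in 0b3700999b, assuming the XML declaration sits at the very beginning of the document. `strip_declared_encoding` and `XML_DECL_ENCODING` are hypothetical names, not the ones used in RssAgent.

```ruby
# Matches the `encoding` attribute inside an XML declaration that starts
# the document. Group 1 keeps everything before the attribute.
XML_DECL_ENCODING = /\A(<\?xml\b[^>]*?)\s+encoding\s*=\s*(["'])[^"']*\2/

# Hypothetical helper: if the encoding is already known (for example from
# the Content-Type header), drop the declared one so that a parser does
# not try to decode the body a second time.
def strip_declared_encoding(body, known_encoding)
  return body if known_encoding.nil?

  body.sub(XML_DECL_ENCODING, '\1')
end
```

When no encoding is known, the body is returned untouched so the parser can still rely on the declaration.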
Andrew Cantino
6fb8fe2292 Increase default database pool size to 20 (#1805) 2016-11-20 09:48:34 -05:00
Akinori MUSHA
c575af959b Merge pull request #1769 from cantino/website_agent_repeat_option
Add a `repeat` option for extractors to WebsiteAgent
2016-11-20 19:16:36 +09:00
The Doctor
5e1191534c Fixed the online documentation for the Weather Agent class. (#1803)
Signed-off-by: The Doctor <drwho@virtadpt.net>
2016-11-19 10:39:15 -05:00
Akinori MUSHA
bd9455d5d0 Add a repeat option to extractors
This allows users to include a value that appears only once in the
content in all events created from that content.
2016-11-18 18:47:32 +09:00
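
The effect of the `repeat` option in bd9455d5d0 might look like the sketch below. `build_events` and its arguments are hypothetical and only illustrate the behaviour described in the commit message.

```ruby
# Hypothetical sketch: `values` maps each extractor name to the list of
# values it matched; names listed in `repeat_keys` matched once and are
# copied into every event.
def build_events(values, repeat_keys)
  # The event count comes from the non-repeated extractors only.
  count = values.reject { |key, _| repeat_keys.include?(key) }
                .values.map(&:size).max || 0

  Array.new(count) do |i|
    values.each_with_object({}) do |(key, list), event|
      event[key] = repeat_keys.include?(key) ? list.first : list[i]
    end
  end
end
```

For example, a page title matched once can be attached to every item extracted from the same page.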
Akinori MUSHA
3a66c152ef Make extract_each prepare a storage for results 2016-11-18 16:34:45 +09:00
Dominik Sander
e26976be76 Merge pull request #1792 from bugdone/master
Fix typos in docker documentation
2016-11-14 23:04:50 +01:00
bugdone
a51d8169dc Fix typos in docker documentation 2016-11-14 22:24:35 +02:00
Andrew Cantino
770706463b Prevent submit from disabling on invalid json (#1790) 2016-11-13 15:21:43 -05:00
Andrew Cantino
085473263f Remove additional nitrous files (#1791) 2016-11-13 15:21:30 -05:00
Andrew Cantino
fca06d6ec2 Nitrous.io is shutting down (#1789) 2016-11-13 14:35:25 -05:00
Akinori MUSHA
b6c1e908c8 Update nokogiri to 1.6.8.1 2016-11-11 15:35:24 +09:00
Akinori MUSHA
74077b0ad4 Auto-focus on Agent Type when creating an agent 2016-11-09 23:25:55 +09:00
Akinori MUSHA
486246e63c Add "image" to Event Description 2016-11-07 12:47:13 +09:00
Akinori MUSHA
1e70b31e7f Merge pull request #1770 from cantino/revert-1071
Revert the special treatment for CDATA introduced in #1071
2016-11-03 12:14:42 +09:00
Akinori MUSHA
50123dca53 Fix event_description broken in full JSON mode or without a template 2016-11-02 21:47:01 +09:00
Akinori MUSHA
898e3d8edb Revert the special treatment for CDATA introduced in #1071 2016-11-02 19:35:16 +09:00
Akinori MUSHA
07effe5eb4 Merge pull request #1766 from cantino/fix_website_agent_url_handling
Fix `url` handling of WebsiteAgent
2016-11-02 13:36:38 +09:00
Akinori MUSHA
e50b8e0d5c Document that url in a created event is automatically resolved 2016-11-02 11:14:02 +09:00
Akinori MUSHA
cc16e854b3 Fix a problem in resolving the url key of a created event
The `url` parameter of handle_data() could hold a string or nil when
invoked from handle_event_data(), in which case resolving `url` in a
created event would fail with a type error.  Moreover, the logic did not
have any guard for URI errors.  This commit should fix them.

Fixes #1765.
2016-11-02 11:14:02 +09:00
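
The guard described in cc16e854b3 can be approximated with Ruby's standard `uri` library. `resolve_url` is a hypothetical helper written for illustration; the actual handle_data() logic may differ.

```ruby
require 'uri'

# Hypothetical sketch: resolve `url` against `base`, tolerating a nil or
# empty base (which may be a String or a URI) and malformed input.
def resolve_url(base, url)
  return url if url.nil? || base.nil? || base.to_s.empty?

  URI.join(base.to_s, url.to_s).to_s
rescue URI::Error
  # Leave the value as-is when either side is not a usable URI.
  url
end
```

Returning the original value on failure is one possible design choice; it keeps event creation going instead of aborting on a single bad link.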
Akinori MUSHA
4fe35b2a1f Reproduce #1765 2016-11-01 22:29:56 +09:00
Akinori MUSHA
91f096b16f Merge pull request #1743 from cantino/website_agent_can_interpolate_after_extraction
WebsiteAgent can interpolate after extraction

Incorporating feedback from @cantino and @dsander.
2016-11-01 20:20:37 +09:00
Dominik Sander
e3f1429a37 Merge pull request #1764 from strugee/http-to-https
Fix Stubhub test failures
2016-11-01 09:43:22 +01:00
Alex Jordan
651eb50729 Fix another Stubhub HTTP URL 2016-10-31 20:58:06 -07:00
Alex Jordan
77da54ea0c Convert a bunch of HTTP links to HTTPS (#1757) 2016-10-31 19:21:03 -04:00
Akinori MUSHA
58fabb885c Add a new Liquid filter rebase_hrefs 2016-10-29 20:40:52 +09:00
Akinori MUSHA
8b897f5da3 Add Liquid variables _response_.url and _url_ to WebsiteAgent 2016-10-29 20:40:51 +09:00
Akinori MUSHA
fe35df8752 Add a new option template to WebsiteAgent
If given, it is used as a Liquid template for each event created by the
Agent, instead of directly emitting the results of extraction as events.

An existing spec needs to be fixed because WebsiteAgent now has the
`template` option, which may not be a hash of hashes.
2016-10-29 20:40:51 +09:00
Andrew Cantino
9a3290ef40 Language changes 2016-10-28 19:05:46 -04:00
Akinori MUSHA
faa2789a0c Fix the order of receivers in the DotHelper specs
This should fix occasional build failure on CI.
2016-10-27 16:31:24 +09:00
Akinori MUSHA
4f93db60e7 Merge pull request #1754 from cantino/ignore_empty_author
Ignore empty author and link entries in RssAgent.

Fixes #1753.
2016-10-27 13:07:21 +09:00
Akinori MUSHA
1e14358648 Merge pull request #1751 from cantino/encoding_detection
Improve encoding detection in WebsiteAgent
2016-10-27 13:00:56 +09:00
Akinori MUSHA
50b5833a3f Improve encoding detection in WebsiteAgent
Previously, WebsiteAgent always assumed that content with no charset
specified in the Content-Type header was encoded in UTF-8.  This
enhancement makes use of the encoding detector implemented in
Nokogiri for HTML/XML documents, instead of blindly falling back to
UTF-8.

When the document `type` is `html` or `xml`, WebsiteAgent tries to
detect the encoding of a fetched document from the presence of a BOM,
XML declaration, or HTML `meta` tag.

This fixes #1742.
2016-10-27 13:00:37 +09:00
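
The detection order named in 50b5833a3f (BOM, XML declaration, HTML `meta` tag) can be illustrated in plain Ruby. This is a simplified approximation, not the Nokogiri detector the commit relies on; `detect_encoding` is a hypothetical name.

```ruby
# Hypothetical sketch: guess the encoding of a fetched document from its
# first bytes. Falls back to `fallback` when nothing is found.
def detect_encoding(body, fallback = "UTF-8")
  head = body.byteslice(0, 1024).b # inspect raw bytes only

  # 1. Byte order mark
  return "UTF-8"    if head.start_with?("\xEF\xBB\xBF".b)
  return "UTF-16LE" if head.start_with?("\xFF\xFE".b)
  return "UTF-16BE" if head.start_with?("\xFE\xFF".b)

  # 2. XML declaration at the beginning of the document
  if (m = head.match(/\A<\?xml\b[^>]*?\bencoding\s*=\s*["']([^"']+)["']/))
    return m[1]
  end

  # 3. HTML meta tag, either charset="..." or content="...; charset=..."
  if (m = head.match(/<meta\b[^>]*?\bcharset\s*=\s*["']?([A-Za-z0-9._:-]+)/i))
    return m[1]
  end

  fallback
end
```

Only the first kilobyte is inspected here, which is an arbitrary limit chosen for the sketch.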
Akinori MUSHA
4d10132709 Fix a bug where an empty <link> is wrongly parsed
Due to a problem in sax-machine's internals, an empty `<link/>` in RSS
would be parsed to JSON as `{ "href": "no_buffer" }`.  Now empty
`<link/>` elements in RSS and ATOM are simply ignored just like other
collection elements like `<category>`.
2016-10-27 09:17:46 +09:00
Akinori MUSHA
2bb97b53bc Add failing specs for empty <link> elements 2016-10-27 09:17:40 +09:00
Akinori MUSHA
5f5e246552 Use Struct#each_pair 2016-10-27 08:13:43 +09:00
Akinori MUSHA
e5c938aa85 Exclude empty entries from authors 2016-10-27 08:12:48 +09:00
Akinori MUSHA
852f39d480 Rescue error from Mail::Address#name and #address
`Mail::Address.new('')` does not raise any error but calling `name` on
the created instance does.
2016-10-27 08:11:10 +09:00
Akinori MUSHA
445665ee3a Add a failing test for #1753 2016-10-27 08:09:57 +09:00