Adds credentials user, password for uri. If realm is set, the credentials are used only for that realm. If realm is not set, the credentials become the default for any realm on that URI.
domain and realm are exclusive as NTLM does not follow RFC 2617. If domain is given it is only used for NTLM authentication.
# File lib/mechanize.rb, line 653
def add_auth uri, user, password, realm = nil, domain = nil
  @agent.add_auth uri, user, password, realm, domain
end
Adds page to the history
# File lib/mechanize.rb, line 1237
def add_to_history(page)
  @agent.history.push(page, @agent.resolve(page.uri))
  @history_added.call(page) if @history_added
end
NOTE: These credentials will be used as a default for any challenge, which exposes your password to malicious servers. Calling this method emits a warning. This method is deprecated and will be removed in mechanize 3.
Sets the user and password as the default credentials to be used for HTTP authentication for any server. The domain is used for NTLM authentication.
# File lib/mechanize.rb, line 630
def auth user, password, domain = nil
  caller.first =~ /(.*?):(\d+).*?$/

  warn <<-WARNING
At #{$1} line #{$2}

Use of #auth and #basic_auth are deprecated due to a security vulnerability.

  WARNING

  @agent.add_default_auth user, password, domain
end
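The deprecation warning finds the offending call site by matching the first entry of caller against a regular expression. A minimal sketch of that match follows; the caller string (script.rb, line 12, fetch_page) is made up for illustration and does not come from mechanize:

```ruby
# Hypothetical caller entry; real entries come from Kernel#caller.
entry = "script.rb:12:in 'fetch_page'"

# Same pattern as in #auth: lazily capture the path, then the line number.
match = entry.match(/(.*?):(\d+).*?$/)
file = match[1]
line = match[2]

puts "At #{file} line #{line}"
```

The lazy first group stops at the first colon that is followed by digits, so the file path and the line number land in separate captures.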
Path to an OpenSSL server certificate file
# File lib/mechanize.rb, line 1005
def ca_file
  @agent.ca_file
end
Sets the certificate file used for SSL connections
# File lib/mechanize.rb, line 1012
def ca_file= ca_file
  @agent.ca_file = ca_file
end
An OpenSSL client certificate or the path to a certificate file.
# File lib/mechanize.rb, line 1019
def cert
  @agent.cert
end
Sets the OpenSSL client certificate cert to the given path or certificate instance
# File lib/mechanize.rb, line 1027
def cert= cert
  @agent.certificate = cert
end
An OpenSSL certificate store for verifying server certificates. This defaults to the default certificate store for your system.
If your system does not ship with a default set of certificates you can retrieve a copy of the set from Mozilla here: curl.haxx.se/docs/caextract.html
(Note that this set does not have an HTTPS download option so you may wish to use the firefox-db2pem.sh script to extract the certificates from a local install to avoid man-in-the-middle attacks.)
After downloading or generating a cacert.pem from the above link you can create a certificate store from the pem file like this:
cert_store = OpenSSL::X509::Store.new
cert_store.add_file 'cacert.pem'
And have mechanize use it with:
agent.cert_store = cert_store
# File lib/mechanize.rb, line 1053
def cert_store
  @agent.cert_store
end
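As a rough sketch of the store setup described above, the following builds a store from the system's default CA locations and adds a downloaded bundle only if it is present. The bundle path is an assumption for illustration; only standard OpenSSL calls are used, and nothing here is mechanize-specific:

```ruby
require 'openssl'

# Build a store backed by the system's default CA locations.
cert_store = OpenSSL::X509::Store.new
cert_store.set_default_paths

# With a downloaded bundle (hypothetical path), add it only if present.
bundle = 'cacert.pem'
cert_store.add_file(bundle) if File.exist?(bundle)
```

The resulting object is what would be handed to agent.cert_store= as shown earlier.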
Sets the OpenSSL certificate store to store.
See also #
# File lib/mechanize.rb, line 1062
def cert_store= cert_store
  @agent.cert_store = cert_store
end
Are If-Modified-Since conditional requests enabled?
# File lib/mechanize.rb, line 660
def conditional_requests
  @agent.conditional_requests
end
Enables or disables If-Modified-Since conditional requests (enabled by default). Pass false to disable them.
# File lib/mechanize.rb, line 667
def conditional_requests= enabled
  @agent.conditional_requests = enabled
end
Follow HTML meta refresh and HTTP Refresh headers. If set to :anywhere meta refresh tags outside of the head element will be followed.
# File lib/mechanize.rb, line 696
def follow_meta_refresh
  @agent.follow_meta_refresh
end
Controls following of HTML meta refresh and HTTP Refresh headers in responses.
# File lib/mechanize.rb, line 704
def follow_meta_refresh= follow
  @agent.follow_meta_refresh = follow
end
Follow HTML meta refresh and HTTP Refresh headers that have no “url=” in the content attribute.
Defaults to false to prevent infinite refresh loops.
# File lib/mechanize.rb, line 714
def follow_meta_refresh_self
  @agent.follow_meta_refresh_self
end
Alters the following of HTML meta refresh and HTTP Refresh headers that point to the same page.
# File lib/mechanize.rb, line 722
def follow_meta_refresh_self= follow
  @agent.follow_meta_refresh_self = follow
end
Is gzip compression of responses enabled?
# File lib/mechanize.rb, line 729
def gzip_enabled
  @agent.gzip_enabled
end
Enables or disables HTTP/1.1 gzip compression (enabled by default). Pass false to disable it.
# File lib/mechanize.rb, line 736
def gzip_enabled=enabled
  @agent.gzip_enabled = enabled
end
Connections that have not been used in this many seconds will be reset.
# File lib/mechanize.rb, line 743
def idle_timeout
  @agent.idle_timeout
end
Sets the idle timeout to idle_timeout. The default timeout is 5 seconds. If you experience “too many connection resets”, reducing this value may help.
# File lib/mechanize.rb, line 751
def idle_timeout= idle_timeout
  @agent.idle_timeout = idle_timeout
end
When set to true mechanize will ignore an EOF during chunked transfer encoding so long as at least one byte was received. Be careful when enabling this as it may cause data loss.
Net::HTTP does not inform mechanize of where in the chunked stream the EOF occurred. Usually it is after the last-chunk but before the terminating CRLF (invalid termination) but it may occur earlier. In the second case your response body may be incomplete.
# File lib/mechanize.rb, line 765
def ignore_bad_chunking
  @agent.ignore_bad_chunking
end
When set to true mechanize will ignore an EOF during chunked transfer encoding. See ignore_bad_chunking for further details
# File lib/mechanize.rb, line 773
def ignore_bad_chunking= ignore_bad_chunking
  @agent.ignore_bad_chunking = ignore_bad_chunking
end
Are HTTP/1.1 keep-alive connections enabled?
# File lib/mechanize.rb, line 780
def keep_alive
  @agent.keep_alive
end
Disables HTTP/1.1 keep-alive connections if enable is set to false. If you are experiencing “too many connection resets” errors, setting this to false will eliminate them.
You should first investigate reducing idle_timeout.
# File lib/mechanize.rb, line 791
def keep_alive= enable
  @agent.keep_alive = enable
end
An OpenSSL private key or the path to a private key
# File lib/mechanize.rb, line 1078
def key
  @agent.key
end
Sets the OpenSSL client key to the given path or key instance. If a path is given, the path must contain an RSA key file.
# File lib/mechanize.rb, line 1086
def key= key
  @agent.private_key = key
end
The current logger. If no logger has been set Mechanize.log is used.
# File lib/mechanize.rb, line 798
def log
  @log || Mechanize.log
end
Sets the logger used by this instance of mechanize
# File lib/mechanize.rb, line 805
def log= logger
  @log = logger
end
Responses larger than this will be written to a Tempfile instead of stored in memory. The default is 100,000 bytes.
A value of nil disables creation of Tempfiles.
# File lib/mechanize.rb, line 815
def max_file_buffer
  @agent.max_file_buffer
end
Sets the maximum size of a response body that will be stored in memory to bytes. A value of nil causes all response bodies to be stored in memory.
Note that for Mechanize::Download subclasses, the maximum buffer size multiplied by the number of pages stored in history (controlled by #) is an approximate upper limit on the amount of memory Mechanize will use. By default, Mechanize can use up to ~5MB to store response bodies for non-File and non-Page (HTML) responses.
See also the discussion under #
# File lib/mechanize.rb, line 832
def max_file_buffer= bytes
  @agent.max_file_buffer = bytes
end
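The ~5MB figure above is the buffer size multiplied by the number of pages kept in history. A small worked calculation is sketched below; the history limit of 50 is an assumption chosen to be consistent with the stated figure, since this section does not give the value:

```ruby
max_file_buffer = 100_000  # default from the text above, in bytes
max_history     = 50       # assumed history limit, not stated in this section

# Approximate upper bound on memory held by buffered response bodies.
upper_bound = max_file_buffer * max_history

puts "approximate upper bound: #{upper_bound} bytes"
```

Raising either value raises the bound proportionally, so the two settings are best tuned together.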
Length of time to wait until a connection is opened in seconds
# File lib/mechanize.rb, line 839
def open_timeout
  @agent.open_timeout
end
Sets the connection open timeout to open_timeout
# File lib/mechanize.rb, line 846
def open_timeout= open_timeout
  @agent.open_timeout = open_timeout
end
Parses the body of the response from uri using the pluggable parser that matches its content type
# File lib/mechanize.rb, line 1162
def parse uri, response, body
  content_type = nil

  unless response['Content-Type'].nil?
    data, = response['Content-Type'].split ';', 2
    content_type, = data.downcase.split ',', 2 unless data.nil?
  end

  parser_klass = @pluggable_parser.parser content_type

  unless parser_klass <= Mechanize::Download then
    body = case body
           when IO, Tempfile, StringIO then
             body.read
           else
             body
           end
  end

  parser_klass.new uri, response, body, response.code do |parser|
    parser.mech = self if parser.respond_to? :mech=

    parser.watch_for_set = @watch_for_set if
      @watch_for_set and parser.respond_to?(:watch_for_set=)
  end
end
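The first step of parse reduces the Content-Type header to a bare, lowercased media type before the parser lookup. The sketch below isolates that step in a standalone helper; media_type is a hypothetical name introduced here and is not part of mechanize:

```ruby
# Extract the bare media type from a Content-Type header value,
# mirroring the two split calls in #parse.
def media_type(header)
  return nil if header.nil?

  data, = header.split ';', 2          # drop parameters such as charset
  return nil if data.nil?

  type, = data.downcase.split ',', 2   # keep the first of a comma-joined list
  type
end

puts media_type('Text/HTML; charset=utf-8')
puts media_type('text/plain,text/html').inspect
puts media_type(nil).inspect
```

A missing or empty header yields nil, which in parse leaves content_type unset so that the pluggable parser picks its fallback.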
OpenSSL client key password
# File lib/mechanize.rb, line 1093
def pass
  @agent.pass
end
Sets the client key password to pass
# File lib/mechanize.rb, line 1100
def pass= pass
  @agent.pass = pass
end
Posts form to uri
# File lib/mechanize.rb, line 1215
def post_form(uri, form, headers = {})
  cur_page = form.page || current_page ||
    Page.new

  request_data = form.request_data

  log.debug("query: #{ request_data.inspect }") if log

  headers = {
    'Content-Type'   => form.enctype,
    'Content-Length' => request_data.size.to_s,
  }.merge headers

  # fetch the page
  page = @agent.fetch uri, :post, headers, [request_data], cur_page
  add_to_history(page)
  page
end
PUT to uri with entity, and setting headers:
put('http://example/', 'new content', {'Content-Type' => 'text/plain'})
# File lib/mechanize.rb, line 477
def put(uri, entity, headers = {})
  request_with_entity(:put, uri, entity, headers)
end
Length of time to wait for data from the server
# File lib/mechanize.rb, line 853
def read_timeout
  @agent.read_timeout
end
Sets the timeout for each chunk of data read from the server to read_timeout. A single request may read many chunks of data.
# File lib/mechanize.rb, line 861
def read_timeout= read_timeout
  @agent.read_timeout = read_timeout
end
Controls how mechanize deals with redirects. The following values are allowed:
:all, true | All 3xx redirects are followed (default)
:permanent | Only 301 Moved Permanently redirects are followed
false      | No redirects are followed
# File lib/mechanize.rb, line 873
def redirect_ok
  @agent.redirect_ok
end
Sets the mechanize redirect handling policy. See redirect_ok for allowed values
# File lib/mechanize.rb, line 883
def redirect_ok= follow
  @agent.redirect_ok = follow
end
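To make the three policies concrete, the sketch below models the table of allowed values as a small decision function. follow_redirect? is a hypothetical helper written for illustration; it is not mechanize's internal code, which lives in the agent and is not shown in this section:

```ruby
# Hypothetical helper: decide whether a response status should be followed
# under a given redirect_ok value.
def follow_redirect?(policy, status)
  return false unless (300..399).cover?(status)

  case policy
  when :all, true then true           # every 3xx redirect
  when :permanent then status == 301  # only 301 Moved Permanently
  else false                          # false (or anything else): never follow
  end
end

puts follow_redirect?(:all, 302)
puts follow_redirect?(:permanent, 302)
puts follow_redirect?(false, 301)
```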
Maximum number of redirections to follow
# File lib/mechanize.rb, line 890
def redirection_limit
  @agent.redirection_limit
end
Sets the maximum number of redirections to follow to limit
# File lib/mechanize.rb, line 897
def redirection_limit= limit
  @agent.redirection_limit = limit
end
A hash of custom request headers that will be sent on every request
# File lib/mechanize.rb, line 904
def request_headers
  @agent.request_headers
end
Replaces the custom request headers that will be sent on every request with request_headers
# File lib/mechanize.rb, line 912
def request_headers= request_headers
  @agent.request_headers = request_headers
end
Makes an HTTP request to uri using HTTP method verb. entity is used as the request body, if allowed.
# File lib/mechanize.rb, line 485
def request_with_entity(verb, uri, entity, headers = {})
  cur_page = current_page || Page.new

  headers = {
    'Content-Type'   => 'application/octet-stream',
    'Content-Length' => entity.size.to_s,
  }.update headers

  page = @agent.fetch uri, verb, headers, [entity], cur_page
  add_to_history(page)
  page
end
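The header handling in request_with_entity starts from two defaults and lets caller-supplied headers override them. A minimal sketch of that merge with plain hashes follows; the entity and the override value are taken from the put example earlier in this section:

```ruby
entity = 'new content'

defaults = {
  'Content-Type'   => 'application/octet-stream',
  'Content-Length' => entity.size.to_s,
}

# Caller-supplied headers win over the defaults, as with Hash#update.
headers = defaults.update('Content-Type' => 'text/plain')

puts headers.inspect
```

Because the caller's hash is applied last, passing a Content-Type replaces application/octet-stream while the computed Content-Length is kept.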
Retry POST and other non-idempotent requests. See RFC 2616 9.1.2.
# File lib/mechanize.rb, line 919
def retry_change_requests
  @agent.retry_change_requests
end
When setting retry_change_requests to true you are stating that, for all the URLs you access with mechanize, making POST and other non-idempotent requests is safe and will not cause data duplication or other harmful results.
If you are experiencing “too many connection resets” errors you should instead investigate reducing the idle_timeout or disabling keep_alive connections.
# File lib/mechanize.rb, line 933
def retry_change_requests= retry_change_requests
  @agent.retry_change_requests = retry_change_requests
end
Will /robots.txt files be obeyed?
# File lib/mechanize.rb, line 940
def robots
  @agent.robots
end
When enabled mechanize will retrieve and obey robots.txt files
# File lib/mechanize.rb, line 948
def robots= enabled
  @agent.robots = enabled
end
The handlers for HTTP and other URI protocols.
# File lib/mechanize.rb, line 955
def scheme_handlers
  @agent.scheme_handlers
end
Replaces the URI scheme handler table with scheme_handlers
# File lib/mechanize.rb, line 962
def scheme_handlers= scheme_handlers
  @agent.scheme_handlers = scheme_handlers
end
Sets the proxy address at port with an optional user and password
# File lib/mechanize.rb, line 1201
def set_proxy address, port, user = nil, password = nil
  @proxy_addr = address
  @proxy_port = port
  @proxy_user = user
  @proxy_pass = password

  @agent.set_proxy address, port, user, password
end
SSL version to use. Ruby 1.9 and newer only.
# File lib/mechanize.rb, line 1107
def ssl_version
  @agent.ssl_version
end
Sets the SSL version to use to version without client/server negotiation. Ruby 1.9 and newer only.
# File lib/mechanize.rb, line 1115
def ssl_version= ssl_version
  @agent.ssl_version = ssl_version
end
Submits form with an optional button.
Without a button:
page = agent.get('http://example.com')
agent.submit(page.forms.first)
With a button:
agent.submit(page.forms.first, page.forms.first.buttons.first)
# File lib/mechanize.rb, line 510
def submit(form, button=nil, headers={})
  form.add_button_to_query(button) if button

  case form.method.upcase
  when 'POST'
    post_form(form.action, form, headers)
  when 'GET'
    get(form.action.gsub(/\?[^\?]*$/, ''),
        form.build_query,
        form.page,
        headers)
  else
    raise ArgumentError, "unsupported method: #{form.method.upcase}"
  end
end
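For GET forms, submit strips any existing query string from the form action before the form's own query is attached. The sketch below applies the same regular expression to two sample actions; the URLs are illustrative only:

```ruby
# Same substitution as in #submit: remove a trailing "?..." from the action.
strip_query = ->(action) { action.gsub(/\?[^\?]*$/, '') }

with_query    = strip_query.call('http://example.com/search?q=old')
without_query = strip_query.call('http://example.com/search')

puts with_query
puts without_query
```

Both calls give the same base URL, so a query already present in the action is replaced by the form's fields instead of being combined with them.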
Runs given block, then resets the page history as it was before. self is given as a parameter to the block. Returns the value of the block.
# File lib/mechanize.rb, line 530
def transact
  history_backup = @agent.history.dup
  begin
    yield self
  ensure
    @agent.history = history_backup
  end
end
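The backup-and-restore pattern behind transact can be tried without a network connection. The sketch below models only the history handling; HistoryKeeper and visit are hypothetical stand-ins for the agent, not mechanize classes:

```ruby
# Stand-in for an agent; only the history handling is modeled here.
class HistoryKeeper
  attr_reader :history

  def initialize
    @history = []
  end

  def visit(page)
    @history.push(page)
    page
  end

  # Run the block, then restore the history even if the block raises.
  def transact
    backup = @history.dup
    begin
      yield self
    ensure
      @history = backup
    end
  end
end

keeper = HistoryKeeper.new
keeper.visit('start')
result = keeper.transact { |k| k.visit('detour') }

puts result
puts keeper.history.inspect
```

The block's return value comes back to the caller, while the page visited inside the block is gone from the history afterwards.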
The identification string for the client initiating a web request
# File lib/mechanize.rb, line 969
def user_agent
  @agent.user_agent
end
Sets the User-Agent used by mechanize to user_agent. See also user_agent_alias
# File lib/mechanize.rb, line 977
def user_agent= user_agent
  @agent.user_agent = user_agent
end
Set the user agent for the Mechanize object based on the given name.
See also AGENT_ALIASES
# File lib/mechanize.rb, line 986
def user_agent_alias= name
  self.user_agent = AGENT_ALIASES[name] ||
    raise(ArgumentError, "unknown agent alias #{name.inspect}")
end
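The lookup-or-raise idiom in user_agent_alias= means an unknown name fails loudly instead of silently setting a nil User-Agent. A minimal sketch follows; the ALIASES table and its single entry are made up, since the real contents of AGENT_ALIASES are not shown in this section:

```ruby
# Hypothetical alias table standing in for AGENT_ALIASES.
ALIASES = { 'Example Browser' => 'ExampleBrowser/1.0' }.freeze

# Return the mapped User-Agent string, or raise for an unknown alias.
def resolve_alias(name)
  ALIASES[name] ||
    raise(ArgumentError, "unknown agent alias #{name.inspect}")
end

puts resolve_alias('Example Browser')

begin
  resolve_alias('nope')
rescue ArgumentError => e
  puts e.message
end
```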
A callback for additional certificate verification. See OpenSSL::SSL::SSLContext#verify_callback
The callback can be used for debugging or to ignore errors by always returning true. Specifying nil uses the default method that was valid when the SSLContext was created
# File lib/mechanize.rb, line 1127
def verify_callback
  @agent.verify_callback
end
Sets the OpenSSL certificate verification callback
# File lib/mechanize.rb, line 1134
def verify_callback= verify_callback
  @agent.verify_callback = verify_callback
end
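A callback intended for debugging can report failures while leaving the verdict untouched. The sketch below builds such a callback against a plain OpenSSL::SSL::SSLContext using only the standard library; passing the same lambda to mechanize's verify_callback= is assumed to work the same way but is not shown here:

```ruby
require 'openssl'

# Log verification failures but leave the verdict unchanged.
callback = lambda do |preverify_ok, store_context|
  warn "certificate error: #{store_context.error_string}" unless preverify_ok
  preverify_ok
end

context = OpenSSL::SSL::SSLContext.new
context.verify_callback = callback
```

Returning true unconditionally instead of preverify_ok would accept every certificate, which is the "ignore errors" behavior mentioned above and removes the protection that verification provides.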
The OpenSSL server certificate verification method. The default is OpenSSL::SSL::VERIFY_PEER, and certificate verification uses the default system certificates. See also cert_store
# File lib/mechanize.rb, line 1143
def verify_mode
  @agent.verify_mode
end
Generated with the Darkfish Rdoc Generator 1.1.6.