Mechanize::PluggableParser

Mechanize allows different parsers for different content types. Mechanize uses PluggableParser to determine which parser to use for any content type. To use your own parser or to change the default parsers, register them with this class through Mechanize#pluggable_parser.

The default parser for unregistered content types is Mechanize::File.

The module Mechanize::Parser provides basic functionality for any content type, so you may use it in custom parsers you write. For small files that you want to operate on in memory, subclass Mechanize::File. For large files, subclass Mechanize::Download, which loads the content into memory only in small chunks.

When writing your own pluggable parser, be sure to provide a method #body that returns a String containing the response body for compatibility with Mechanize#get_file.
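As a rough illustration of that contract, here is a minimal sketch of a parser-shaped class that does not inherit from Mechanize::File. The class name PlainTextParser and its lines helper are hypothetical, and the four-argument constructor simply follows the signature used in the example further down. Whether Mechanize expects additional methods from a parser is not covered here.

```ruby
# Hypothetical stand-alone parser; it does not require the mechanize gem.
class PlainTextParser
  attr_reader :uri, :response, :body, :code

  # Same four optional parameters as the CSVParser example in this document.
  def initialize(uri = nil, response = nil, body = nil, code = nil)
    @uri      = uri
    @response = response
    @code     = code
    @body     = body.to_s # #body always returns a String
  end

  # Illustrative helper (an assumption, not part of any Mechanize API).
  def lines
    body.lines.map(&:chomp)
  end
end

parser = PlainTextParser.new(nil, nil, "first\nsecond\n", '200')
parser.body   # the raw response body as a String
parser.lines  # the body split into lines without trailing newlines
```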

Example

To create your own parser, define a class whose constructor takes four parameters (uri, response, body and code). Here is an example of registering a parser that handles CSV files:

  require 'csv'

  class CSVParser < Mechanize::File
    attr_reader :csv

    def initialize uri = nil, response = nil, body = nil, code = nil
      super uri, response, body, code
      @csv = CSV.parse body
    end
  end

  agent = Mechanize.new
  agent.pluggable_parser.csv = CSVParser
  agent.get('http://example.com/test.csv')  # => CSVParser

Now any response with a content type of ‘text/csv’ will initialize a CSVParser and return that object to the caller.
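To make the csv attribute of such a parser more concrete, the following sketch shows what CSV.parse produces for a small body. It uses only the csv library that the example already requires. The helper name parse_csv_body and the sample data are made up for illustration.

```ruby
require 'csv'

# Hypothetical helper mirroring the CSV.parse call in CSVParser#initialize.
def parse_csv_body(body)
  CSV.parse(body)
end

rows = parse_csv_body("name,qty\napple,3\n")
rows.first  # => ["name", "qty"]
rows.last   # => ["apple", "3"]
```

Every field comes back as a String, so numeric columns would still need converting by the caller.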

To register a parser for a content type that Mechanize does not know about, use the hash syntax:

  agent.pluggable_parser['text/something'] = SomeClass

To set the default parser, assign it with #default=:

  agent.pluggable_parser.default = Mechanize::Download

Now all unknown content types will be saved to disk and not loaded into memory.

Constants

CONTENT_TYPES

Attributes

default[RW]

Public Class Methods

new()

  # File lib/mechanize/pluggable_parsers.rb, line 72
  def initialize
    @parsers = {
      CONTENT_TYPES[:html]  => Mechanize::Page,
      CONTENT_TYPES[:xhtml] => Mechanize::Page,
      CONTENT_TYPES[:wap]   => Mechanize::Page,
      'image'               => Mechanize::Image
    }

    @default = Mechanize::File
  end

Public Instance Methods

[](content_type)

Retrieves the parser for content_type content

  # File lib/mechanize/pluggable_parsers.rb, line 147
  def [](content_type)
    @parsers[content_type]
  end

[]=(content_type, klass)

Sets the parser for content_type content to klass

The content_type may be a full MIME type, a simplified MIME type (‘text/x-csv’ simplifies to ‘text/csv’), or a media type like ‘image’.
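The sketch below shows how keys of these three kinds could be matched when a parser is looked up, trying the most specific form first. It is a stand-alone approximation, not Mechanize's implementation: the PARSERS table, the symbols in it, and the simplify and lookup helpers are all hypothetical, and simplify only imitates the ‘x-’ stripping described above instead of calling the MIME types library.

```ruby
# Hypothetical registry; symbols stand in for parser classes.
PARSERS = {
  'text/html' => :page_parser,  # full MIME type
  'text/csv'  => :csv_parser,   # reached through the simplified form
  'image'     => :image_parser  # media type only
}.freeze

DEFAULT_PARSER = :file_parser

# Assumed approximation of a "simplified" MIME type: drop parameters,
# lowercase, and remove a leading "x-" from each part.
def simplify(content_type)
  media, sub = content_type.split(';').first.to_s.strip.downcase.split('/', 2)
  return nil unless media && sub
  "#{media.sub(/\Ax-/, '')}/#{sub.sub(/\Ax-/, '')}"
end

# Try the exact string, then the simplified type, then the media type.
def lookup(content_type)
  return DEFAULT_PARSER unless content_type
  simplified = simplify(content_type)
  return DEFAULT_PARSER unless simplified
  PARSERS[content_type] ||
    PARSERS[simplified] ||
    PARSERS[simplified.split('/').first] ||
    DEFAULT_PARSER
end

lookup('text/html')        # => :page_parser
lookup('text/x-csv')       # => :csv_parser
lookup('image/png')        # => :image_parser
lookup('application/zip')  # => :file_parser
```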

  # File lib/mechanize/pluggable_parsers.rb, line 157
  def []= content_type, klass
    register_parser content_type, klass
  end

csv=(klass)

Registers klass as the parser for text/csv content

  # File lib/mechanize/pluggable_parsers.rb, line 133
  def csv=(klass)
    register_parser(CONTENT_TYPES[:csv], klass)
  end

html=(klass)

Registers klass as the parser for text/html and application/xhtml+xml content

  # File lib/mechanize/pluggable_parsers.rb, line 111
  def html=(klass)
    register_parser(CONTENT_TYPES[:html], klass)
    register_parser(CONTENT_TYPES[:xhtml], klass)
  end

parser(content_type)

Returns the parser registered for the given content_type

  # File lib/mechanize/pluggable_parsers.rb, line 86
  def parser content_type
    return default unless content_type

    parser = @parsers[content_type]

    return parser if parser

    mime_type = MIME::Type.new content_type

    parser = @parsers[mime_type.to_s] ||
             @parsers[mime_type.simplified] ||
             @parsers[mime_type.media_type] ||
             default
  rescue MIME::InvalidContentType
    default
  end

pdf=(klass)

Registers klass as the parser for application/pdf content

  # File lib/mechanize/pluggable_parsers.rb, line 126
  def pdf=(klass)
    register_parser(CONTENT_TYPES[:pdf], klass)
  end

xhtml=(klass)

Registers klass as the parser for application/xhtml+xml content

  # File lib/mechanize/pluggable_parsers.rb, line 119
  def xhtml=(klass)
    register_parser(CONTENT_TYPES[:xhtml], klass)
  end

xml=(klass)

Registers klass as the parser for text/xml content

  # File lib/mechanize/pluggable_parsers.rb, line 140
  def xml=(klass)
    register_parser(CONTENT_TYPES[:xml], klass)
  end
