CodeRay::Scanners::Scanner

Scanner

The base class for all Scanners.

It is a subclass of Ruby’s great StringScanner, which makes it easy to access the scanning methods inside.

It is also Enumerable, so you can use it like an Array of Tokens:

  require 'coderay'
  
  c_scanner = CodeRay::Scanners[:c].new "if (*p == '{') nest++;"
  
  for text, kind in c_scanner
    puts text if kind == :operator
  end
  
  # prints: (*==)++;

OK, this is a very simple example :) You can also use map, any?, find and even sort_by, if you want.
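
As a rough illustration of what Enumerable buys you, here is a sketch that needs only Ruby's standard library. MiniScanner is a hypothetical stand-in, not a CodeRay class, and its token kinds are made up for the example; it subclasses StringScanner, includes Enumerable and yields [text, kind] pairs from each.

```ruby
require 'strscan'

# Hypothetical stand-in for a CodeRay scanner (not part of CodeRay).
class MiniScanner < StringScanner
  include Enumerable

  # Yields [text, kind] pairs for whitespace, integers, identifiers
  # and single-character operators.
  def each
    return enum_for(:each) unless block_given?
    reset
    until eos?
      if (text = scan(/\s+/))
        yield [text, :space]
      elsif (text = scan(/\d+/))
        yield [text, :integer]
      elsif (text = scan(/\w+/))
        yield [text, :ident]
      else
        yield [getch, :operator]
      end
    end
  end
end

scanner = MiniScanner.new("x = 40 + 2")

# Every Enumerable method is built on top of each.
operators     = scanner.select { |_text, kind| kind == :operator }.map(&:first)
has_numbers   = scanner.any? { |_text, kind| kind == :integer }
first_ident   = scanner.find { |_text, kind| kind == :ident }
longest_first = scanner.sort_by { |text, _kind| -text.size }.first
```

Because each restarts the scan, every Enumerable call above walks the whole input again; the real Scanner avoids that by caching its tokens.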

Constants

ScanError

Raised if a Scanner fails while scanning.

DEFAULT_OPTIONS

The default options for all scanner classes.

A subclass can define its own DEFAULT_OPTIONS constant; new reads the defaults through self.class::DEFAULT_OPTIONS.

KINDS_NOT_LOC

Attributes

state[RW]

Public Class Methods

encoding(name = 'UTF-8')

The encoding used internally by this scanner.

    # File lib/coderay/scanner.rb, line 89
    def encoding name = 'UTF-8'
      @encoding ||= defined?(Encoding.find) && Encoding.find(name)
    end

file_extension(extension = lang)

The typical filename suffix for this scanner’s language.

    # File lib/coderay/scanner.rb, line 84
    def file_extension extension = lang
      @file_extension ||= extension.to_s
    end

lang()

The lang of this Scanner class, which is equal to its Plugin ID.

    # File lib/coderay/scanner.rb, line 94
    def lang
      @plugin_id
    end

new(code = '', options = {})

Create a new Scanner.

  • code is the input String and is handled by the superclass StringScanner.

  • options is a Hash with Symbols as keys. It is merged with the default options of the class (you can overwrite default options here.)

If options[:tokens] is given, that object collects the scanned tokens. Else, a new Tokens object is used.
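
The option handling can be sketched in isolation. BaseScanner and TabbedScanner below are hypothetical classes that copy only the merge logic of initialize, :tab_width is an invented option, and a plain Array stands in for the Tokens object.

```ruby
# Hypothetical classes that mimic only the option handling of Scanner#initialize.
class BaseScanner
  DEFAULT_OPTIONS = { :keep_tokens => false }.freeze

  attr_reader :options, :tokens

  def initialize(options = {})
    # self.class::DEFAULT_OPTIONS picks up a subclass's own constant if it has one.
    @options = self.class::DEFAULT_OPTIONS.merge(options)
    # Use the caller's token container if given, else a fresh one
    # (an Array here; CodeRay uses a Tokens object).
    @tokens = options[:tokens] || []
  end
end

class TabbedScanner < BaseScanner
  # :tab_width is an invented option, used only for this example.
  DEFAULT_OPTIONS = { :keep_tokens => false, :tab_width => 8 }.freeze
end

defaults = TabbedScanner.new
custom   = TabbedScanner.new(:tab_width => 2)
shared   = []
reusing  = TabbedScanner.new(:tokens => shared)
```

Hash#merge returns a new Hash, so the defaults themselves are never modified by the options passed to new.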

    # File lib/coderay/scanner.rb, line 143
    def initialize code = '', options = {}
      if self.class == Scanner
        raise NotImplementedError, "I am only the basic Scanner class. I can't scan anything. :( Use my subclasses."
      end

      @options = self.class::DEFAULT_OPTIONS.merge options

      super self.class.normalize(code)

      @tokens = options[:tokens] || Tokens.new
      @tokens.scanner = self if @tokens.respond_to? :scanner=

      setup
    end

normalize(code)

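Normalizes the given code into a string with UNIX newlines, in the scanner’s internal encoding, with invalid and undefined characters replaced by placeholders. An empty string, or a valid string that is already in the internal encoding and contains no carriage returns, is returned as is rather than copied.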
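
A stand-alone sketch of the same idea, using only core String methods. normalize_sketch is a hypothetical helper, not CodeRay's method, and it is deliberately simplified: where CodeRay guesses the source encoding of an invalid string, this version just calls String#scrub.

```ruby
# Hypothetical stand-alone helper that mirrors the idea of Scanner.normalize;
# it is a simplification, not CodeRay's implementation.
def normalize_sketch(code, target = Encoding::UTF_8)
  code = code.to_s
  return code if code.empty?

  if code.encoding == target
    # Assumption: String#scrub stands in for CodeRay's encoding guess.
    code.scrub('?').gsub(/\r\n?/, "\n")
  else
    # Transcode, convert CRLF and CR to LF, and replace anything that
    # cannot be represented in the target encoding.
    code.encode(target, code.encoding,
                :universal_newline => true, :undef => :replace, :invalid => :replace)
  end
end

# A Latin-1 string with Windows and old Mac line endings.
latin1 = "caf\xE9\r\nbar\r".b.force_encoding(Encoding::ISO_8859_1)
# A string tagged as UTF-8 that contains an invalid byte.
broken = "ok\xFF\r\n".b.force_encoding(Encoding::UTF_8)

from_latin1 = normalize_sketch(latin1)
from_broken = normalize_sketch(broken)
```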

    # File lib/coderay/scanner.rb, line 69
    def normalize code
      # original = code
      code = code.to_s unless code.is_a? ::String
      return code if code.empty?

      if code.respond_to? :encoding
        code = encode_with_encoding code, self.encoding
      else
        code = to_unix code
      end
      # code = code.dup if code.eql? original
      code
    end

Protected Class Methods

encode_with_encoding(code, target_encoding)

    # File lib/coderay/scanner.rb, line 100
    def encode_with_encoding code, target_encoding
      if code.encoding == target_encoding
        if code.valid_encoding?
          return to_unix(code)
        else
          source_encoding = guess_encoding code
        end
      else
        source_encoding = code.encoding
      end
      # print "encode_with_encoding from #{source_encoding} to #{target_encoding}"
      code.encode target_encoding, source_encoding, :universal_newline => true, :undef => :replace, :invalid => :replace
    end

guess_encoding(s)

    # File lib/coderay/scanner.rb, line 118
    def guess_encoding s
      #:nocov:
      IO.popen("file -b --mime -", "w+") do |file|
        file.write s[0, 1024]
        file.close_write
        begin
          Encoding.find file.gets[/charset=([-\w]+)/, 1]
        rescue ArgumentError
          Encoding::BINARY
        end
      end
      #:nocov:
    end

to_unix(code)

    # File lib/coderay/scanner.rb, line 114
    def to_unix code
      code.index(?\r) ? code.gsub(/\r\n?/, "\n") : code
    end

Public Instance Methods

binary_string()

The string in binary encoding.

To be used with #pos, which is the index of the byte the scanner will scan next.
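
The difference between byte and character indexes only shows up with multibyte text. The following sketch uses a plain StringScanner and String#b (a binary copy of the string) rather than a CodeRay scanner, so it illustrates the idea and is not a usage of binary_string itself.

```ruby
require 'strscan'

# "\u00E4" (a-umlaut) occupies two bytes in UTF-8.
scanner = StringScanner.new("\u00E4b c")
scanner.scan(/\u00E4/)

byte_pos = scanner.pos                   # a byte index, 2 here
by_chars = scanner.string[byte_pos]      # character index 2, one too far
by_bytes = scanner.string.b[byte_pos, 1] # byte index 2, the next byte to scan
```

Indexing the original string with pos skips past the character that will be scanned next; indexing the binary copy lands on it.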

    # File lib/coderay/scanner.rb, line 243
    def binary_string
      @binary_string ||=
        if string.respond_to?(:bytesize) && string.bytesize != string.size
          #:nocov:
          string.dup.force_encoding('binary')
          #:nocov:
        else
          string
        end
    end

column(pos = self.pos)

The current column position of the scanner, starting with 1. See also: #line.

    # File lib/coderay/scanner.rb, line 234
    def column pos = self.pos
      return 1 if pos <= 0
      pos - (binary_string.rindex(?\n, pos - 1) || -1)
    end

each(&block)

Traverse the tokens.

    # File lib/coderay/scanner.rb, line 217
    def each &block
      tokens.each(&block)
    end

file_extension()

The default file extension for this scanner.

    # File lib/coderay/scanner.rb, line 178
    def file_extension
      self.class.file_extension
    end

lang()

The Plugin ID for this scanner.

    # File lib/coderay/scanner.rb, line 173
    def lang
      self.class.lang
    end

line(pos = self.pos)

The current line position of the scanner, starting with 1. See also: #column.

Beware, this is implemented inefficiently. It should be used for debugging only.
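
The arithmetic behind the line and column lookups can be tried on a plain binary string. line_at and column_at are hypothetical free-standing helpers; the column version assumes a fallback of -1 when no newline precedes pos, so that columns on the first line also start at 1.

```ruby
# Hypothetical free-standing versions of the line and column lookups.
def line_at(bytes, pos)
  return 1 if pos <= 0
  # Every newline before pos starts a new line. This rescans the text
  # from the beginning on each call, which is why it is slow.
  bytes[0...pos].count("\n") + 1
end

def column_at(bytes, pos)
  return 1 if pos <= 0
  # Distance from the last newline before pos; -1 means "no newline yet".
  last_newline = bytes.rindex("\n", pos - 1) || -1
  pos - last_newline
end

bytes = "ab\ncd\nef".b
pos   = bytes.index("d") # byte index 4, second character of line 2

line   = line_at(bytes, pos)
column = column_at(bytes, pos)
```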

    # File lib/coderay/scanner.rb, line 227
    def line pos = self.pos
      return 1 if pos <= 0
      binary_string[0...pos].count("\n") + 1
    end

reset()

Resets the scanner to the start of the string. Subclasses should redefine the reset_instance method instead of this one.

    # File lib/coderay/scanner.rb, line 160
    def reset
      super
      reset_instance
    end

string=(code)

Set a new string to be scanned.

    # File lib/coderay/scanner.rb, line 166
    def string= code
      code = self.class.normalize(code)
      super code
      reset_instance
    end

tokenize(source = nil, options = {})

Scans the code and returns all tokens in a Tokens object.

    # File lib/coderay/scanner.rb, line 183
    def tokenize source = nil, options = {}
      options = @options.merge(options)
      @tokens = options[:tokens] || @tokens || Tokens.new
      @tokens.scanner = self if @tokens.respond_to? :scanner=
      case source
      when Array
        self.string = self.class.normalize(source.join)
      when nil
        reset
      else
        self.string = self.class.normalize(source)
      end

      begin
        scan_tokens @tokens, options
      rescue => e
        message = "Error in %s#scan_tokens, initial state was: %p" % [self.class, defined?(state) && state]
        raise_inspect e.message, @tokens, message, 30, e.backtrace
      end

      @cached_tokens = @tokens
      if source.is_a? Array
        @tokens.split_into_parts(*source.map { |part| part.size })
      else
        @tokens
      end
    end

tokens()

Returns the tokens, caching the result of tokenize.
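
The caching relies on Ruby's ||= idiom. CachingTokenizer below is a hypothetical class written for this example (it splits on whitespace instead of scanning); it shows that the expensive call runs once until the cache is cleared, which in Scanner is the job of reset_instance.

```ruby
# Hypothetical class illustrating the ||= caching used by tokens.
class CachingTokenizer
  attr_reader :runs

  def initialize(text)
    @text = text
    @runs = 0
  end

  # Stand-in for the expensive scan; counts how often it runs.
  def tokenize
    @runs += 1
    @text.split
  end

  # Runs tokenize only if nothing has been cached yet.
  def tokens
    @cached_tokens ||= tokenize
  end

  # Changing the input invalidates the cache.
  def text=(text)
    @text = text
    @cached_tokens = nil
  end
end

tokenizer = CachingTokenizer.new("a b")
first  = tokenizer.tokens
second = tokenizer.tokens
runs_before = tokenizer.runs

tokenizer.text = "c d e"
third = tokenizer.tokens
runs_after = tokenizer.runs
```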

    # File lib/coderay/scanner.rb, line 212
    def tokens
      @cached_tokens ||= tokenize
    end

Protected Instance Methods

raise_inspect(msg, tokens, state = self.state || 'No state given!', ambit = 30, backtrace = caller)

Raises a ScanError with additional status information.

    # File lib/coderay/scanner.rb, line 281
    def raise_inspect msg, tokens, state = self.state || 'No state given!', ambit = 30, backtrace = caller
      raise ScanError, <<-EOE % [

    ***ERROR in %s: %s (after %d tokens)

    tokens:
    %s

    current line: %d  column: %d  pos: %d
    matched: %p  state: %p
    bol? = %p,  eos? = %p

    surrounding code:
    %p  ~~  %p

    ***ERROR***

      EOE
        File.basename(caller[0]),
        msg,
        tokens.respond_to?(:size) ? tokens.size : 0,
        tokens.respond_to?(:last) ? tokens.last(10).map { |t| t.inspect }.join("\n") : '',
        line, column, pos,
        matched, state, bol?, eos?,
        binary_string[pos - ambit, ambit],
        binary_string[pos, ambit],
      ], backtrace
    end

reset_instance()

Resets the scanner.

    # File lib/coderay/scanner.rb, line 274
    def reset_instance
      @tokens.clear if @tokens.respond_to?(:clear) && !@options[:keep_tokens]
      @cached_tokens = nil
      @binary_string = nil if defined? @binary_string
    end

scan_rest()

Shorthand for scan_until(/\z/). This method also avoids a JRuby 1.9 mode bug.
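
The equivalence can be checked with a plain StringScanner; the input string is made up for the example. Reading rest and then calling terminate returns the same text as scan_until(/\z/) and leaves the scanner at the end of the string.

```ruby
require 'strscan'

# The scan_rest approach: read the remainder, then jump to the end.
scanner = StringScanner.new("key = value")
scanner.scan(/\w+ = /)
rest = scanner.rest # everything after the current position
scanner.terminate   # move the scan pointer to the end of the string

# The regexp-based equivalent.
other = StringScanner.new("key = value")
other.scan(/\w+ = /)
same = other.scan_until(/\z/)
```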

    # File lib/coderay/scanner.rb, line 314
    def scan_rest
      rest = self.rest
      terminate
      rest
    end

scan_tokens(tokens, options)

This is the central method, and commonly the only one a subclass implements.

Subclasses must implement this method; it must return tokens and must only use Tokens#<< for storing scanned tokens!
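
To make the contract concrete, here is a miniature of it built on the standard library alone. TinyScanner and WordScanner are hypothetical classes, the token kinds are invented, and an Array of [text, kind] pairs stands in for the Tokens object; a real CodeRay scanner would subclass Scanner instead.

```ruby
require 'strscan'

# Hypothetical miniature of the Scanner contract (not CodeRay code).
class TinyScanner < StringScanner
  # Public entry point: rewinds and delegates to the subclass hook.
  def tokenize
    reset
    scan_tokens([], {})
  end

  protected

  # The base class cannot scan anything on its own.
  def scan_tokens(tokens, options)
    raise NotImplementedError, "#{self.class}#scan_tokens not implemented."
  end
end

class WordScanner < TinyScanner
  protected

  # Consumes the whole input, stores tokens only via <<, and returns them.
  def scan_tokens(tokens, options)
    until eos?
      if (text = scan(/\s+/))
        tokens << [text, :space]
      elsif (text = scan(/\w+/))
        tokens << [text, :word]
      else
        tokens << [getch, :other]
      end
    end
    tokens
  end
end

words = WordScanner.new("hi there!").tokenize
```

The final else branch consumes one character with getch, so the loop always advances and cannot hang on unexpected input.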

    # File lib/coderay/scanner.rb, line 269
    def scan_tokens tokens, options  # :doc:
      raise NotImplementedError, "#{self.class}#scan_tokens not implemented."
    end

setup()

Can be implemented by subclasses to do some initialization that has to be done once per instance.

Use reset for initialization that has to be done once per scan.

    # File lib/coderay/scanner.rb, line 261
    def setup  # :doc:
    end

Generated with the Darkfish Rdoc Generator 1.1.6.