Stringex::StringExtensions

These methods are all added to the String class.

Public Instance Methods

collapse(character = " ")

Removes specified character from the beginning and/or end of the string and then performs String#squeeze(character), condensing runs of the character within the string.

Note: This method has been superseded by ActiveSupport’s squish method.

    # File lib/stringex/string_extensions.rb, line 212
    def collapse(character = " ")
      sub(/^#{character}*/, "").sub(/#{character}*$/, "").squeeze(character)
    end
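
A minimal usage sketch of the trim-then-squeeze behavior described for collapse. To stay runnable without the stringex gem, the logic is copied into a hypothetical standalone helper, collapse_sketch, which is not part of the library.

```ruby
# Hypothetical standalone copy of the collapse logic (not the library method).
def collapse_sketch(string, character = " ")
  string
    .sub(/^#{character}*/, "")   # drop leading run of the character
    .sub(/#{character}*$/, "")   # drop trailing run of the character
    .squeeze(character)          # condense interior runs to one character
end

puts collapse_sketch("  foo   bar  ")      # prints "foo bar"
puts collapse_sketch("--foo---bar--", "-") # prints "foo-bar"
```

The character is interpolated into the regular expression unescaped, so this sketch assumes a character with no special regex meaning.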
convert_accented_entities()

Converts HTML entities into the respective non-accented letters. Examples:

  "&aacute;".convert_accented_entities # => "a"
  "&ccedil;".convert_accented_entities # => "c"
  "&egrave;".convert_accented_entities # => "e"
  "&icirc;".convert_accented_entities # => "i"
  "&oslash;".convert_accented_entities # => "o"
  "&uuml;".convert_accented_entities # => "u"

Note: This does not do any conversion of Unicode/ASCII accented-characters. For that functionality please use to_ascii.

    # File lib/stringex/string_extensions.rb, line 73
    def convert_accented_entities
      gsub(/&([A-Za-z])(grave|acute|circ|tilde|uml|ring|cedil|slash);/, '\1').strip
    end
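
The difference between entity input and literal accented input can be sketched as follows. The helper name convert_accented_entities_sketch is hypothetical; it wraps the same regular expression so the example runs without the gem.

```ruby
# Hypothetical standalone copy of the entity-stripping regex shown above.
def convert_accented_entities_sketch(string)
  string.gsub(/&([A-Za-z])(grave|acute|circ|tilde|uml|ring|cedil|slash);/, '\1').strip
end

# Named entities are reduced to their base letter.
puts convert_accented_entities_sketch("caf&eacute;")     # prints "cafe"
puts convert_accented_entities_sketch("fa&ccedil;ade")   # prints "facade"

# A literal accented character is left alone, as the note says.
puts convert_accented_entities_sketch("caf\u00e9") == "caf\u00e9" # prints "true"
```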
convert_misc_characters(options = {})

Converts various common plaintext characters to a more URI-friendly representation. Examples:

  "foo & bar".convert_misc_characters # => "foo and bar"
  "Chanel #9".convert_misc_characters # => "Chanel number 9"
  "user@host".convert_misc_characters # => "user at host"
  "google.com".convert_misc_characters # => "google dot com"
  "$10".convert_misc_characters # => "10 dollars"
  "*69".convert_misc_characters # => "star 69"
  "100%".convert_misc_characters # => "100 percent"
  "windows/mac/linux".convert_misc_characters # => "windows slash mac slash linux"

Note: Because this method converts every & symbol to the string “and”, run any methods that convert HTML entities (convert_accented_entities and convert_misc_entities) before running this method.

    # File lib/stringex/string_extensions.rb, line 165
    def convert_misc_characters(options = {})
      dummy = dup.gsub(/\.{3,}/, " dot dot dot ") # Catch ellipses before single dot rule!
      # Special rules for money
      {
        /(\s|^)\$(\d+)\.(\d+)(\s|$)/ => '\2 dollars \3 cents',
        /(\s|^)£(\d+)\.(\d+)(\s|$)/ => '\2 pounds \3 pence',
      }.each do |found, replaced|
        replaced = " #{replaced} " unless replaced =~ /\\1/
        dummy.gsub!(found, replaced)
      end
      # Back to normal rules
      misc_characters =
      {
        /\s*&\s*/ => "and",
        /\s*#/ => "number",
        /\s*@\s*/ => "at",
        /(\S|^)\.(\S)/ => '\1 dot \2',
        /(\s|^)\$(\d*)(\s|$)/ => '\2 dollars',
        /(\s|^)£(\d*)(\s|$)/ => '\2 pounds',
        /(\s|^)¥(\d*)(\s|$)/ => '\2 yen',
        /\s*\*\s*/ => "star",
        /\s*%\s*/ => "percent",
        /(\s*=\s*)/ => " equals ",
        /\s*\+\s*/ => "plus",
        /\s*°\s*/ => "degrees"
      }
      misc_characters[/\s*(\\|\/)\s*/] = 'slash' unless options[:allow_slash]
      misc_characters.each do |found, replaced|
        replaced = " #{replaced} " unless replaced =~ /\\1/
        dummy.gsub!(found, replaced)
      end
      dummy = dummy.gsub(/(^|[[:alpha:]])'([[:alpha:]]|$)/, '\1\2').gsub(/[\.,:;()\[\]\/\?!\^'ʼ"_]/, " ").strip
    end
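
The ordering note for convert_misc_characters can be illustrated with two deliberately reduced stand-ins that keep only the ampersand rules. Both helper names (entities_sketch and characters_sketch) are hypothetical and cover far less than the real methods.

```ruby
# Hypothetical stand-in for the entity step: only &amp; / &#38; is handled.
def entities_sketch(string)
  string.gsub(/&(#38|amp);/, "and")
end

# Hypothetical stand-in for the character step: only "&" and ";" are handled.
def characters_sketch(string)
  string.gsub(/\s*&\s*/, " and ").gsub(/;/, " ").strip
end

input = "Tom &amp; Jerry"

# Entities first: the entity is decoded before "&" is rewritten.
puts characters_sketch(entities_sketch(input)) # prints "Tom and Jerry"

# Characters first: the "&" of the entity is rewritten, leaving "amp" behind.
puts entities_sketch(characters_sketch(input)).include?("amp") # prints "true"
```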
convert_misc_entities()

Converts HTML entities (taken from common Textile/RedCloth formattings) into plain text formats.

Note: This isn’t an attempt at complete conversion of HTML entities, just those most likely to be generated by Textile.

    # File lib/stringex/string_extensions.rb, line 81
    def convert_misc_entities
      dummy = dup
      {
        "#822[01]" => "\"",
        "#821[67]" => "'",
        "#8230" => "...",
        "#8211" => "-",
        "#8212" => "--",
        "#215" => "x",
        "gt" => ">",
        "lt" => "<",
        "(#8482|trade)" => "(tm)",
        "(#174|reg)" => "(r)",
        "(#169|copy)" => "(c)",
        "(#38|amp)" => "and",
        "nbsp" => " ",
        "(#162|cent)" => " cent",
        "(#163|pound)" => " pound",
        "(#188|frac14)" => "one fourth",
        "(#189|frac12)" => "half",
        "(#190|frac34)" => "three fourths",
        "(#176|deg)" => " degrees "
      }.each do |textiled, normal|
        dummy.gsub!(/&#{textiled};/, normal)
      end
      dummy.gsub(/&[^;]+;/, "").strip
    end
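
A sketch of the table-driven approach in convert_misc_entities, using only a handful of the entries above. ENTITY_SUBSET and convert_misc_entities_sketch are hypothetical names chosen for this illustration.

```ruby
# A small, illustrative subset of the entity table shown above.
ENTITY_SUBSET = {
  "#822[01]"    => "\"",
  "#821[67]"    => "'",
  "#8230"       => "...",
  "(#169|copy)" => "(c)",
  "(#38|amp)"   => "and"
}

# Hypothetical standalone version of the same loop.
def convert_misc_entities_sketch(string)
  result = string.dup
  ENTITY_SUBSET.each do |pattern, plain|
    result.gsub!(/&#{pattern};/, plain) # each key is a regex fragment
  end
  result.gsub(/&[^;]+;/, "").strip      # any entity not in the table is dropped
end

puts convert_misc_entities_sketch('&#8220;Wait&#8230;&#8221; &copy; 2010 &hearts;')
# prints "Wait..." (c) 2010   (with the double quotes included)
```

Note the final gsub: entities that are not in the table, such as &hearts; here, are removed rather than left in place.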
convert_smart_punctuation()

Converts MS Word ‘smart punctuation’ to ASCII.

    # File lib/stringex/string_extensions.rb, line 137
    def convert_smart_punctuation
      dummy = dup
      {

        "(“|”|\302\223|\302\224|\303\222|\303\223)" => '"',
        "(‘|’|\302\221|\302\222|\303\225)" => "'",
        "…" => "...",
      }.each do |smart, normal|
        dummy.gsub!(/#{smart}/, normal)
      end
      dummy.strip
    end
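
A reduced sketch of the smart-punctuation conversion, covering only the Unicode curly quotes and the ellipsis; the legacy byte sequences in the source above are left out. The helper name convert_smart_punctuation_sketch is hypothetical.

```ruby
# Hypothetical, Unicode-only version of the conversion.
def convert_smart_punctuation_sketch(string)
  string
    .gsub(/[\u201C\u201D]/, '"')   # curly double quotes
    .gsub(/[\u2018\u2019]/, "'")   # curly single quotes and apostrophes
    .gsub("\u2026", "...")         # horizontal ellipsis
    .strip
end

puts convert_smart_punctuation_sketch("\u201CIt\u2019s fine\u2026\u201D")
# prints "It's fine..."   (with straight double quotes included)
```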
convert_vulgar_fractions()

Converts vulgar fractions from supported HTML entities and Unicode characters to plain text formats.

    # File lib/stringex/string_extensions.rb, line 111
    def convert_vulgar_fractions
      dummy = dup
      {
        "(&#188;|&frac14;|¼)" => "one fourth",
        "(&#189;|&frac12;|½)" => "half",
        "(&#190;|&frac34;|¾)" => "three fourths",
        "(&#8531;|⅓)" => "one third",
        "(&#8532;|⅔)" => "two thirds",
        "(&#8533;|⅕)" => "one fifth",
        "(&#8534;|⅖)" => "two fifths",
        "(&#8535;|⅗)" => "three fifths",
        "(&#8536;|⅘)" => "four fifths",
        "(&#8537;|⅙)" => "one sixth",
        "(&#8538;|⅚)" => "five sixths",
        "(&#8539;|⅛)" => "one eighth",
        "(&#8540;|⅜)" => "three eighths",
        "(&#8541;|⅝)" => "five eighths",
        "(&#8542;|⅞)" => "seven eighths"
      }.each do |textiled, normal|
        dummy.gsub!(/#{textiled}/, normal)
      end
      dummy
    end
limit(lim = nil)
    # File lib/stringex/string_extensions.rb, line 42
    def limit(lim = nil)
      lim.nil? ? self : self[0...lim]
    end
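
Reading the source above, limit returns the string unchanged when no limit is given and otherwise the first lim characters. A minimal sketch, using a hypothetical standalone helper named limit_sketch:

```ruby
# Hypothetical standalone copy of the limit logic.
def limit_sketch(string, lim = nil)
  lim.nil? ? string : string[0...lim] # exclusive range keeps exactly lim characters
end

puts limit_sketch("hello world", 5) # prints "hello"
puts limit_sketch("hello world")    # prints "hello world"
puts limit_sketch("hi", 5)          # prints "hi"
```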
remove_formatting(options = {})

Performs multiple text manipulations. Essentially a shortcut for typing them all. View source below to see which methods are run.

    # File lib/stringex/string_extensions.rb, line 48
    def remove_formatting(options = {})
      strip_html_tags.convert_smart_punctuation.convert_accented_entities.convert_vulgar_fractions.convert_misc_entities.convert_misc_characters(options).to_ascii.collapse
    end
replace_whitespace(replace = " ")

Replaces runs of whitespace in the string. Defaults to a single space, but any replacement string may be specified as an argument. Examples:

  "Foo       bar".replace_whitespace # => "Foo bar"
  "Foo       bar".replace_whitespace("-") # => "Foo-bar"
    # File lib/stringex/string_extensions.rb, line 204
    def replace_whitespace(replace = " ")
      gsub(/\s+/, replace)
    end
strip_html_tags(leave_whitespace = false)

Removes HTML tags from text. This code is simplified from Tobias Luettke’s regular expression in Typo.

    # File lib/stringex/string_extensions.rb, line 54
    def strip_html_tags(leave_whitespace = false)
      name = /[\w:_-]+/
      value = /([A-Za-z0-9]+|('[^']*?'|"[^"]*?"))/
      attr = /(#{name}(\s*=\s*#{value})?)/
      rx = /<[!\/?\[]?(#{name}|--)(\s+(#{attr}(\s+#{attr})*))?\s*([!\/?\]]+|--)?>/
      (leave_whitespace) ?  gsub(rx, "").strip : gsub(rx, "").gsub(/\s+/, " ").strip
    end
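
A sketch of how the leave_whitespace flag changes the result of tag stripping. The regular expressions are copied from the source above into a hypothetical standalone helper, strip_html_tags_sketch.

```ruby
# Hypothetical standalone copy of the tag-stripping logic.
def strip_html_tags_sketch(string, leave_whitespace = false)
  name  = /[\w:_-]+/
  value = /([A-Za-z0-9]+|('[^']*?'|"[^"]*?"))/
  attr  = /(#{name}(\s*=\s*#{value})?)/
  rx    = /<[!\/?\[]?(#{name}|--)(\s+(#{attr}(\s+#{attr})*))?\s*([!\/?\]]+|--)?>/
  stripped = string.gsub(rx, "")
  # By default, whitespace runs left behind by removed tags are condensed.
  leave_whitespace ? stripped.strip : stripped.gsub(/\s+/, " ").strip
end

puts strip_html_tags_sketch('<p class="intro">Hello <b>world</b></p>')
# prints "Hello world"

html = "<p>Hello</p>\n\n<p>world</p>"
puts strip_html_tags_sketch(html)                          # prints "Hello world"
puts strip_html_tags_sketch(html, true) == "Hello\n\nworld" # prints "true"
```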
to_ascii()

Returns string with its UTF-8 characters transliterated to ASCII ones. Example:

  "⠋⠗⠁⠝⠉⠑".to_ascii #=> "braille"
    # File lib/stringex/unidecoder.rb, line 155
    def to_ascii
      Stringex::Unidecoder.decode(self)
    end
to_html(lite_mode = false)

Returns the string converted (via Textile/RedCloth) to HTML format, or self (with a friendly warning) if RedCloth is not available.

Passing a true value for lite_mode (for example, :lite) causes RedCloth not to wrap the HTML in a container P element, which is useful for generating header element text, etc. This is roughly equivalent to ActionView’s textilize_without_paragraph except that it makes RedCloth do all the work instead of just gsubbing the return from RedCloth.

    # File lib/stringex/string_extensions.rb, line 18
    def to_html(lite_mode = false)
      if defined?(RedCloth)
        if lite_mode
          RedCloth.new(self, [:lite_mode]).to_html
        else
          if self =~ /<pre>/
            RedCloth.new(self).to_html.tr("\t", "")
          else
            RedCloth.new(self).to_html.tr("\t", "").gsub(/\n\n/, "")
          end
        end
      else
        warn "String#to_html was called without RedCloth being successfully required"
        self
      end
    end
to_url(options = {})

Creates a URI-friendly representation of the string. This is used internally by acts_as_url but can be called manually to generate a URI-friendly version of any string.

    # File lib/stringex/string_extensions.rb, line 38
    def to_url(options = {})
      remove_formatting(options).downcase.replace_whitespace("-").collapse("-").limit(options[:limit])
    end
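
A simplified sketch of the tail of the to_url pipeline: downcase, replace whitespace with hyphens, collapse hyphens, then apply the optional :limit. The remove_formatting step is deliberately omitted, so this assumes plain ASCII input with no markup. The helper name to_url_sketch is hypothetical.

```ruby
# Hypothetical, reduced version of to_url (no remove_formatting step).
def to_url_sketch(string, options = {})
  slug = string.downcase.gsub(/\s+/, "-")                 # replace_whitespace("-")
  slug = slug.sub(/^-*/, "").sub(/-*$/, "").squeeze("-")  # collapse("-")
  limit = options[:limit]
  limit.nil? ? slug : slug[0...limit]                     # limit(options[:limit])
end

puts to_url_sketch("  Hello   World  ")           # prints "hello-world"
puts to_url_sketch("Hello World", :limit => 5)    # prints "hello"
```

Because the limit is applied last, a truncated result can end in a hyphen if the cut falls right after one.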