Contains Unicode codepoints, loading as needed from YAML files
Returns string with its UTF-8 characters transliterated to ASCII ones
You’re probably better off just using the added String#to_ascii
    # File lib/stringex/unidecoder.rb, line 16
    def decode(string)
      string.gsub(/[^\x00-\x7f]/) do |codepoint|
        if localized = local_codepoint(codepoint)
          localized
        else
          begin
            unpacked = codepoint.unpack("U")[0]
            CODEPOINTS[code_group(unpacked)][grouped_point(unpacked)]
          rescue
            # Hopefully this won't come up much
            # TODO: Make this note something to the user that is reportable to me perhaps
            "?"
          end
        end
      end
    end
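The lookup that decode performs can be tried without the gem. The following is a minimal sketch, assuming a tiny hand-written table (`SKETCH_TABLE`, a hypothetical stand-in for the YAML-backed CODEPOINTS) and leaving out the localized lookup; `sketch_decode` is an illustrative name, not part of the library.

```ruby
# Hypothetical stand-in for CODEPOINTS: group name => 256-entry array.
# Entries that were never filled in default to "?".
SKETCH_TABLE = Hash.new { |hash, group| hash[group] = Array.new(256, "?") }
SKETCH_TABLE["x00"][0xe9] = "e" # U+00E9
SKETCH_TABLE["x00"][0xfc] = "u" # U+00FC

# Replace every non-ASCII character with its table entry.
def sketch_decode(string)
  string.gsub(/[^\x00-\x7f]/) do |char|
    point = char.unpack("U")[0]                        # Unicode codepoint as Integer
    SKETCH_TABLE["x%02x" % (point >> 8)][point & 255]  # group, then index in group
  end
end

puts sketch_decode("caf\u00e9") # prints "cafe"
```

Characters missing from the sketch table come back as "?", which mirrors the fallback in the rescue branch of the listing above.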
Returns default locale for localized transliterations. NOTE: Will set @locale as well.
    # File lib/stringex/unidecoder.rb, line 72
    def default_locale
      @default_locale ||= "en"
      @locale = @default_locale
    end
Sets the default locale for localized transliterations. NOTE: Will set @locale as well.
    # File lib/stringex/unidecoder.rb, line 78
    def default_locale=(new_locale)
      @default_locale = new_locale
      # Seems logical that @locale should be the new default
      @locale = new_locale
    end
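The NOTE on both accessors has a practical consequence: merely reading the default locale overwrites whatever locale was set before. The sketch below reproduces that behavior in a standalone module (`LocaleSketch`, a hypothetical name); it copies the reader from the listing and omits the I18n branch of `locale`.

```ruby
# Standalone imitation of the default_locale reader and a simplified locale reader.
module LocaleSketch
  class << self
    attr_writer :locale

    def locale
      @locale || default_locale
    end

    # Same body as the listed reader: it also assigns @locale.
    def default_locale
      @default_locale ||= "en"
      @locale = @default_locale
    end
  end
end

LocaleSketch.locale = :de
LocaleSketch.default_locale # reading the default resets @locale
puts LocaleSketch.locale    # prints "en"
```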
Returns the character for the given Unicode codepoint, where the codepoint is passed as a string of hexadecimal digits (for example "00e9")
    # File lib/stringex/unidecoder.rb, line 34
    def encode(codepoint)
      ["0x#{codepoint}".to_i(16)].pack("U")
    end
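Because the argument is interpolated after "0x" and parsed in base 16, it is read as hexadecimal digits. A small sketch of the same expression, wrapped in a hypothetical `sketch_encode` helper so it runs without the gem:

```ruby
# Same expression as the listing: parse hex digits, pack as one UTF-8 character.
def sketch_encode(codepoint)
  ["0x#{codepoint}".to_i(16)].pack("U")
end

puts sketch_encode("00e9") == "\u00e9" # prints "true"
puts sketch_encode("20ac") == "\u20ac" # prints "true"
```

Passing the Integer 233 would be interpolated as "0x233" and therefore yield U+0233 rather than U+00E9.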
Returns string indicating which file (and line) contains the transliteration value for the character
    # File lib/stringex/unidecoder.rb, line 40
    def in_yaml_file(character)
      unpacked = character.unpack("U")[0]
      "#{code_group(unpacked)}.yml (line #{grouped_point(unpacked) + 2})"
    end
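As a worked example of the string this produces, the sketch below inlines the two helper computations into one hypothetical method, `sketch_in_yaml_file`. The offset of 2 is taken directly from the listing; the reason for it (presumably a header line plus 1-based line numbering) is an assumption.

```ruby
# Inlined version of in_yaml_file: high bits name the file, low byte picks the line.
def sketch_in_yaml_file(character)
  point = character.unpack("U")[0]
  "x%02x.yml (line %d)" % [point >> 8, (point & 255) + 2]
end

puts sketch_in_yaml_file("\u00e9") # prints "x00.yml (line 235)"
```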
Returns the localized transliteration for a codepoint
    # File lib/stringex/unidecoder.rb, line 85
    def local_codepoint(codepoint)
      locale_hash = LOCAL_CODEPOINTS[locale] || LOCAL_CODEPOINTS[locale.is_a?(Symbol) ? locale.to_s : locale.to_sym]
      locale_hash && locale_hash[codepoint]
    end
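The lookup tries the locale as given and then its Symbol or String counterpart, so either key type finds the same table. A minimal sketch, assuming a hypothetical `SKETCH_LOCAL` table with one made-up German entry, and taking the locale as an explicit argument instead of reading it from module state:

```ruby
# Hypothetical localization data: locale => { character => replacement }.
SKETCH_LOCAL = { "de" => { "\u00fc" => "ue" } }

def sketch_local_codepoint(locale, char)
  # Try the key as given, then its Symbol/String counterpart.
  table = SKETCH_LOCAL[locale] ||
          SKETCH_LOCAL[locale.is_a?(Symbol) ? locale.to_s : locale.to_sym]
  table && table[char]
end

puts sketch_local_codepoint(:de, "\u00fc")         # prints "ue"
puts sketch_local_codepoint(:en, "\u00fc").inspect # prints "nil"
```

A nil result is what lets decode fall through to the general CODEPOINTS table.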
Returns locale for localized transliterations
    # File lib/stringex/unidecoder.rb, line 56
    def locale
      if @locale
        @locale
      elsif defined?(I18n)
        I18n.locale
      else
        default_locale
      end
    end
Sets locale for localized transliterations
    # File lib/stringex/unidecoder.rb, line 67
    def locale=(new_locale)
      @locale = new_locale
    end
Adds localized transliterations to Unidecoder
    # File lib/stringex/unidecoder.rb, line 46
    def localize_from(hash_or_path_to_file)
      hash = if hash_or_path_to_file.is_a?(Hash)
        hash_or_path_to_file
      else
        YAML.load_file(hash_or_path_to_file)
      end
      verify_local_codepoints hash
    end
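Both accepted inputs end up as the same nested Hash: locale at the top level, then character to replacement. The sketch below shows that shape by parsing an inline YAML string with the standard library; the German entry is made up for illustration, and YAML.load on a string stands in for the YAML.load_file call in the listing.

```ruby
require "yaml"

# Hypothetical file contents: one locale with one character mapping.
yaml_text = <<~'YAML'
  de:
    "\u00fc": "ue"
YAML

from_yaml = YAML.load(yaml_text)
from_hash = { "de" => { "\u00fc" => "ue" } }

puts from_yaml == from_hash # prints "true"
```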
Runs a block with the default locale
    # File lib/stringex/unidecoder.rb, line 100
    def with_default_locale(&block)
      with_locale default_locale, &block
    end
Runs a block with a temporary locale setting, restoring the original locale when the block completes. Passing :default as the locale selects the default locale.
    # File lib/stringex/unidecoder.rb, line 91
    def with_locale(new_locale, &block)
      new_locale = default_locale if new_locale == :default
      original_locale = locale
      self.locale = new_locale
      block.call
      self.locale = original_locale
    end
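As listed, the method restores the locale only when the block returns normally, and its return value is the restored locale rather than the block's result. The following is a sketch of a variant that restores the locale even when the block raises and that returns the block's value. It is an illustration on a hypothetical standalone module (`WithLocaleSketch`), not the library's implementation, and it omits the :default shortcut.

```ruby
module WithLocaleSketch
  class << self
    attr_accessor :locale

    # Temporarily switch locale; ensure restores it on any exit path.
    def with_locale(new_locale)
      original = locale
      self.locale = new_locale
      yield
    ensure
      self.locale = original
    end
  end
end

WithLocaleSketch.locale = :en
inside = WithLocaleSketch.with_locale(:de) { WithLocaleSketch.locale }
puts inside.inspect                  # prints ":de"
puts WithLocaleSketch.locale.inspect # prints ":en"
```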
Returns the Unicode codepoint grouping for the given character
    # File lib/stringex/unidecoder.rb, line 106
    def code_group(unpacked_character)
      "x%02x" % (unpacked_character >> 8)
    end
Returns the index of the given character in the YAML file for its codepoint group
    # File lib/stringex/unidecoder.rb, line 111
    def grouped_point(unpacked_character)
      unpacked_character & 255
    end
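Together, the two helpers split a codepoint into a group name (everything above the low byte) and an index within that group (the low byte). A short worked example of the arithmetic, using U+20AC as an arbitrary sample value:

```ruby
point = 0x20ac # U+20AC, chosen only as a sample codepoint

group = "x%02x" % (point >> 8) # bits above the low byte, as hex
index = point & 255            # the low byte

puts group # prints "x20"
puts index # prints "172"

# The two parts recombine into the original codepoint.
puts ((point >> 8) << 8 | index) == point # prints "true"
```

Each group therefore covers 256 consecutive codepoints. For codepoints above U+FFFF the group name simply grows beyond two hex digits, since `%02x` sets only a minimum width.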
Checks that the given Hash is in the expected format before merging it into LOCAL_CODEPOINTS, and raises an instructive ArgumentError if it is not
    # File lib/stringex/unidecoder.rb, line 117
    def verify_local_codepoints(hash)
      pass_check = hash.is_a?(Hash) && hash.all?{|key, value|
        # Fuck a duck, eh?
        [Symbol, String].include?(key.class) && value.is_a?(Hash) &&
        value.keys.all?{|k| k.is_a?(String)} && value.values.all?{|v| v.is_a?(String)}
      }
      if pass_check
        hash.each do |k, v|
          LOCAL_CODEPOINTS[k] = v
        end
      else
        raise ArgumentError, "LOCAL_CODEPOINTS is not correctly defined. Please see the README for more information on how to correctly format this data."
      end
    end
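The format check can be read as a predicate: locale keys must be Symbols or Strings, and each locale maps String characters to String replacements. The sketch below isolates that predicate in a hypothetical `sketch_valid_localization?` method so candidate data can be tested without touching LOCAL_CODEPOINTS; the sample entries are made up.

```ruby
# True when the data has the shape: { locale => { "char" => "replacement" } }.
def sketch_valid_localization?(hash)
  hash.is_a?(Hash) && hash.all? { |locale, table|
    [Symbol, String].include?(locale.class) &&
      table.is_a?(Hash) &&
      table.all? { |char, replacement| char.is_a?(String) && replacement.is_a?(String) }
  }
end

puts sketch_valid_localization?({ de: { "\u00fc" => "ue" } })   # prints "true"
puts sketch_valid_localization?({ "de" => { 252 => "ue" } })    # prints "false"
puts sketch_valid_localization?([["de", { "\u00fc" => "ue" }]]) # prints "false"
```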