Array
Once you’ve matched a list of elements, you will often need to handle them as a group, or perform the same action on each of them. Hpricot::Elements is an extension of Ruby’s Array class, with some methods added for altering the elements contained in the array.
If you need to create an element array from regular elements:
Hpricot::Elements[ele1, ele2, ele3]
This assumes that ele1, ele2 and ele3 contain element objects (Hpricot::Elem, Hpricot::Doc, etc.).
Usually the Hpricot::Elements you’re working on comes from a search you’ve done. Well, you can continue searching the list by using the same at and search methods you can use on plain elements.
elements = doc.search("/div/p")
elements = elements.search("/a[@href='http://hoodwink.d/']")
elements = elements.at("img")
When you’re altering elements in the list, your changes will be reflected in the document you started searching from.
doc = Hpricot("That's my <b>spoon</b>, Tyler.")
doc.at("b").swap("<i>fork</i>")
doc.to_html #=> "That's my <i>fork</i>, Tyler."
If no method here does what you need, you may need to loop through the elements and use a method from Hpricot::Container::Trav instead.
For example, you may want to search for all the H3 header tags in a document and grab all the tags underneath the header, but not inside the header. A good method for this is next_sibling:
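ary = []
doc.search("h3").each do |h3|
  ele = h3
  # advance the cursor on every pass; calling h3.next_sibling each time would never terminate
  while ele = ele.next_sibling
    ary << ele   # stuff away all the elements under the h3
  end
end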
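The sibling walk can be tried without Hpricot installed. The following is a minimal sketch in plain Ruby: SiblingNode is a hypothetical stand-in class (not part of Hpricot) that only knows its name and how to find its next sibling, and the loop additionally stops at the next header, which is an assumption about what "underneath the header" should mean.

```ruby
# Hypothetical stand-in for an Hpricot element: a name plus a reference
# to the ordered list of nodes that share its parent.
class SiblingNode
  attr_reader :name

  def initialize(name, siblings)
    @name = name
    @siblings = siblings
  end

  # Return the node that follows this one, or nil at the end of the list.
  def next_sibling
    idx = @siblings.index { |n| n.equal?(self) }
    @siblings[idx + 1]
  end
end

siblings = []
%w[h3 p ul h3 p].each { |name| siblings << SiblingNode.new(name, siblings) }

collected = []
ele = siblings.first
# Move the cursor forward each time; stop at the next h3 or at the end.
while (ele = ele.next_sibling) && ele.name != "h3"
  collected << ele.name
end

puts collected.inspect
```

The important detail is that the cursor variable is reassigned on every pass, so the loop ends when it runs out of siblings.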
Most of the useful element methods are in the mixins Hpricot::Traverse and Hpricot::Container::Trav.
Given two elements, attempt to gather an Elements array of everything between (and including) those two elements.
# File lib/hpricot/elements.rb, line 319
def self.expand(ele1, ele2, excl=false)
  ary = []
  offset = excl ? -1 : 0

  if ele1 and ele2
    # let's quickly take care of siblings
    if ele1.parent == ele2.parent
      ary = ele1.parent.children[ele1.node_position..(ele2.node_position+offset)]
    else
      # find common parent
      p, ele1_p = ele1, [ele1]
      ele1_p.unshift p while p.respond_to?(:parent) and p = p.parent
      p, ele2_p = ele2, [ele2]
      ele2_p.unshift p while p.respond_to?(:parent) and p = p.parent
      common_parent = ele1_p.zip(ele2_p).select { |p1, p2| p1 == p2 }.flatten.last

      child = nil
      if ele1 == common_parent
        child = ele2
      elsif ele2 == common_parent
        child = ele1
      end

      if child
        ary = common_parent.children[0..(child.node_position+offset)]
      end
    end
  end

  return Elements[*ary]
end
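The common-parent step can be illustrated in isolation. This is a plain-Ruby sketch, not Hpricot's own code: TreeNode is a hypothetical struct standing in for an element, and the comparison uses take_while where the listing above uses select.

```ruby
# Hypothetical stand-in for an element: a name and a link to its parent.
TreeNode = Struct.new(:name, :parent)

# Build the chain from the root down to the node by following parent links.
def ancestry(node)
  chain = [node]
  chain.unshift(node) while (node = node.parent)
  chain
end

root = TreeNode.new("html", nil)
body = TreeNode.new("body", root)
div  = TreeNode.new("div", body)
para = TreeNode.new("p", div)
span = TreeNode.new("span", body)

# Pair the two chains level by level and keep the shared prefix.
pairs  = ancestry(para).zip(ancestry(span))
common = pairs.take_while { |a, b| a.equal?(b) }.last.first

puts common.name
```

Both chains start at the same root, so the last pair that still matches is the deepest ancestor the two nodes share.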
# File lib/hpricot/elements.rb, line 274
def self.filter(nodes, expr, truth = true)
  until expr.empty?
    _, *m = *expr.match(/^(?:#{ATTR_RE}|#{BRACK_RE}|#{FUNC_RE}|#{CUST_RE}|#{CATCH_RE})/)
    break unless _

    expr = $'
    m.compact!
    if m[0] == '@'
      m[0] = "@#{m.slice!(2,1).join}"
    end

    if m[0] == '[' && m[1] =~ /^\d+$/
      m = [":", "nth", m[1].to_i-1]
    end

    if m[0] == ":" && m[1] == "not"
      nodes, = Elements.filter(nodes, m[2], false)
    elsif "#{m[0]}#{m[1]}" =~ /^(:even|:odd)$/
      new_nodes = []
      nodes.each_with_index {|n,i| new_nodes.push(n) if (i % 2 == (m[1] == "even" ? 0 : 1)) }
      nodes = new_nodes
    elsif "#{m[0]}#{m[1]}" =~ /^(:first|:last)$/
      nodes = [nodes.send(m[1])]
    else
      meth = "filter[#{m[0]}#{m[1]}]" unless m[0].empty?
      if meth and Traverse.method_defined? meth
        args = m[2..-1]
      else
        meth = "filter[#{m[0]}]"
        if Traverse.method_defined? meth
          args = m[1..-1]
        end
      end
      args << -1
      nodes = Elements[*nodes.find_all do |x|
        args[-1] += 1
        x.send(meth, *args) ? truth : !truth
      end]
    end
  end
  [nodes, expr]
end
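Two of the simpler rules in this method can be tried on an ordinary array. The sketch below is plain Ruby rather than Hpricot's own code path; the strings stand in for nodes, and the variable names are illustrative only.

```ruby
nodes = %w[a b c d e]

# :even keeps zero-based positions 0, 2, 4, ...; :odd keeps 1, 3, ...
even = nodes.each_with_index.select { |_, i| i % 2 == 0 }.map(&:first)
odd  = nodes.each_with_index.select { |_, i| i % 2 == 1 }.map(&:first)

# A bare number in brackets, such as "[2]", is one-based in the expression
# and is turned into a zero-based index before the nth filter is applied.
nth_index = "2".to_i - 1

puts even.inspect
puts odd.inspect
puts nth_index
```

Note that :even is counted from zero, so it selects the first, third and fifth nodes of the list.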
Adds the class to all matched elements.
(doc/"p").add_class("bacon")
Now all paragraphs will have class="bacon".
# File lib/hpricot/elements.rb, line 226
def add_class class_name
  each do |el|
    next unless el.respond_to? :get_attribute
    classes = el.get_attribute('class').to_s.split(" ")
    el.set_attribute('class', classes.push(class_name).uniq.join(" "))
  end
  self
end
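The string handling at the heart of this method can be checked on its own. This is a small plain-Ruby sketch; add_class_to is a hypothetical helper written for illustration and is not part of Hpricot.

```ruby
# Hypothetical helper: compute the new value of a class attribute.
# class_attr may be nil when the element has no class attribute yet.
def add_class_to(class_attr, class_name)
  classes = class_attr.to_s.split(" ")
  classes.push(class_name).uniq.join(" ")
end

puts add_class_to(nil, "bacon")
puts add_class_to("lead intro", "bacon")
puts add_class_to("bacon lead", "bacon")
```

Because of the uniq call, adding a class that is already present leaves the attribute unchanged rather than duplicating the name.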
Just after each element in this list, add some HTML. Pass in an HTML str, which is turned into Hpricot elements.
# File lib/hpricot/elements.rb, line 154
def after(str = nil, &blk)
  each { |x| x.parent.insert_after x.make(str, &blk), x }
end
Add to the end of the contents inside each element in this list. Pass in an HTML str, which is turned into Hpricot elements.
# File lib/hpricot/elements.rb, line 136
def append(str = nil, &blk)
  each { |x| x.html(x.children + x.make(str, &blk)) }
end
Searches this list for the first element (or child of these elements) matching the CSS or XPath expression expr. Root is assumed to be the element scanned.
See Hpricot::Container::Trav.at for more.
# File lib/hpricot/elements.rb, line 67
def at(expr, &blk)
  if expr.kind_of? Fixnum
    super
  else
    search(expr, &blk)[0]
  end
end
Gets and sets attributes on all matched elements.
Pass in a key on its own and this method will return the string value assigned to that attribute for the first element, or nil if the attribute isn’t found.
doc.search("a").attr("href") #=> "http://hacketyhack.net/"
Or, pass in a key and value. This will set an attribute for all matched elements.
doc.search("p").attr("class", "basic")
You may also use a Hash to set a series of attributes:
(doc/"a").attr(:class => "basic", :href => "http://hackety.org/")
Lastly, a block can be used to rewrite an attribute based on the element it belongs to. The block will pass in an element. Return from the block the new value of the attribute.
records.attr("href") { |e| e['href'] + "#top" }
This example adds a #top anchor to each link.
# File lib/hpricot/elements.rb, line 205
def attr key, value = nil, &blk
  if value or blk
    each do |el|
      el.set_attribute(key, value || blk[el])
    end
    return self
  end
  if key.is_a? Hash
    key.each { |k,v| self.attr(k,v) }
    return self
  else
    return self[0].get_attribute(key)
  end
end
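The four calling styles can be exercised without Hpricot. The sketch below is an assumption-laden stand-in: FakeElements is a hypothetical Array subclass whose members are plain hashes rather than Hpricot elements, and its attr method mirrors the dispatch order of the listing above.

```ruby
# Hypothetical stand-in for Hpricot::Elements; each member is a Hash of
# attribute names to values instead of a real element.
class FakeElements < Array
  def attr(key, value = nil, &blk)
    # Setter forms: a literal value, or a block called once per element.
    if value || blk
      each { |el| el[key] = value || blk[el] }
      return self
    end
    if key.is_a?(Hash)
      # Hash form: set every pair in turn on all elements.
      key.each { |k, v| attr(k, v) }
      self
    else
      # Getter form: read from the first element only.
      self[0][key]
    end
  end
end

links = FakeElements[{ "href" => "/a" }, { "href" => "/b" }]

first_href = links.attr("href")
links.attr("href") { |el| el["href"] + "#top" }
links.attr("class" => "basic", "rel" => "nofollow")

puts first_href
puts links.inspect
```

The getter form reads only the first element, while every setter form writes to all of them and returns the list so calls can be chained.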
Add some HTML just previous to each element in this list. Pass in an HTML str, which is turned into Hpricot elements.
# File lib/hpricot/elements.rb, line 148
def before(str = nil, &blk)
  each { |x| x.parent.insert_before x.make(str, &blk), x }
end
Empty the elements in this list, by removing their insides.
doc = Hpricot("<p> We have <i>so much</i> to say.</p>")
doc.search("i").empty
doc.to_html #=> "<p> We have <i></i> to say.</p>"
# File lib/hpricot/elements.rb, line 130
def empty
  each { |x| x.inner_html = nil }
end
# File lib/hpricot/elements.rb, line 351
def filter(expr)
  nodes, = Elements.filter(self, expr)
  nodes
end
Returns an HTML fragment built of the contents of each element in this list.
If an HTML string is supplied, this method acts like inner_html=.
# File lib/hpricot/elements.rb, line 86
def inner_html(*string)
  if string.empty?
    map { |x| x.inner_html }.join
  else
    x = self.inner_html = string.pop || x
  end
end
Replaces the contents of each element in this list. Supply an HTML string, which is loaded into Hpricot objects and inserted into every element in this list.
# File lib/hpricot/elements.rb, line 99
def inner_html=(string)
  each { |x| x.inner_html = string }
end
Returns a string containing the text contents of each element in this list. All HTML tags are removed.
# File lib/hpricot/elements.rb, line 107
def inner_text
  map { |x| x.inner_text }.join
end
# File lib/hpricot/elements.rb, line 356
def not(expr)
  if expr.is_a? Traverse
    nodes = self - [expr]
  else
    nodes, = Elements.filter(self, expr, false)
  end
  nodes
end
Add to the start of the contents inside each element in this list. Pass in an HTML str, which is turned into Hpricot elements.
# File lib/hpricot/elements.rb, line 142
def prepend(str = nil, &blk)
  each { |x| x.html(x.make(str, &blk) + x.children) }
end
Remove all elements in this list from the document which contains them.
doc = Hpricot("<html>Remove this: <b>here</b></html>")
doc.search("b").remove
doc.to_html #=> "<html>Remove this: </html>"
# File lib/hpricot/elements.rb, line 119
def remove
  each { |x| x.parent.children.delete(x) }
end
Remove an attribute from each of the matched elements.
(doc/"input").remove_attr("disabled")
# File lib/hpricot/elements.rb, line 239
def remove_attr name
  each do |el|
    next unless el.respond_to? :remove_attribute
    el.remove_attribute(name)
  end
  self
end
Removes a class from all matched elements.
(doc/"span").remove_class("lightgrey")
Or, to remove all classes:
(doc/"span").remove_class
# File lib/hpricot/elements.rb, line 255
def remove_class name = nil
  each do |el|
    next unless el.respond_to? :get_attribute
    if name
      classes = el.get_attribute('class').to_s.split(" ")
      el.set_attribute('class', (classes - [name]).uniq.join(" "))
    else
      el.remove_attribute("class")
    end
  end
  self
end
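The named-class branch is ordinary string and array work, so it can be tried in plain Ruby. In this sketch remove_class_from is a hypothetical helper written for illustration; it is not an Hpricot method.

```ruby
# Hypothetical helper: compute the class attribute left after removing
# one class name. class_attr may be nil when no class attribute is set.
def remove_class_from(class_attr, name)
  classes = class_attr.to_s.split(" ")
  (classes - [name]).uniq.join(" ")
end

puts remove_class_from("lightgrey note", "lightgrey").inspect
puts remove_class_from("note", "lightgrey").inspect
puts remove_class_from("lightgrey", "lightgrey").inspect
```

Removing the only class yields an empty string, so judging by the listing above the element would keep an empty class attribute; calling remove_class with no argument is what deletes the attribute itself.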
Searches this list for any elements (or children of these elements) matching the CSS or XPath expression expr. Root is assumed to be the element scanned.
See Hpricot::Container::Trav.search for more.
# File lib/hpricot/elements.rb, line 58
def search(*expr,&blk)
  Elements[*map { |x| x.search(*expr,&blk) }.flatten.uniq]
end
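The way per-element results are merged can be seen with ordinary arrays. This is a plain-Ruby sketch in which each inner array is a made-up stand-in for the matches one element's search would return.

```ruby
# Each inner array stands in for x.search(expr) on one element of the list.
# "a2" appears twice, as it would if two searched elements overlap.
per_element = [%w[a1 a2], %w[a2 a3], []]

# Flatten to a single list and drop repeats, keeping first-seen order.
combined = per_element.flatten.uniq

puts combined.inspect
```

A match reachable from more than one element in the list therefore shows up only once in the result.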
Convert this group of elements into a complete HTML fragment, returned as a string.
# File lib/hpricot/elements.rb, line 78
def to_html
  map { |x| x.output("") }.join
end
Wraps each element in the list inside the element created by HTML str. If more than one element is found in the string, Hpricot locates the deepest spot inside the first element.
doc.search("a[@href]").
  wrap(%{<div class="link"><div class="link_inner"></div></div>})
This code wraps every link on the page inside a div.link and a div.link_inner nest.
# File lib/hpricot/elements.rb, line 166
def wrap(str = nil, &blk)
  each do |x|
    wrap = x.make(str, &blk)
    nest = wrap.detect { |w| w.respond_to? :children }
    unless nest
      raise "No wrapping element found."
    end
    x.parent.replace_child(x, wrap)
    nest = nest.children.first until nest.empty?
    nest.html([x])
  end
end
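The search for the deepest spot is a short descent through first children. The sketch below shows it in plain Ruby; Box is a hypothetical struct standing in for an element, and it tests children.empty? directly where the listing above calls empty? on the element.

```ruby
# Hypothetical stand-in for an element: a label and a list of children.
Box = Struct.new(:name, :children)

inner = Box.new("div.link_inner", [])
outer = Box.new("div.link", [inner])

# Follow the first child at each level until a childless node is reached.
nest = outer
nest = nest.children.first until nest.children.empty?

# The wrapped element is then placed inside that innermost node.
nest.children << Box.new("a", [])

puts nest.name
```

With the two nested divs from the example above, the descent ends at div.link_inner, which is where each link is placed.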