Getting HTML source or rich text from the X clipboard


How can rich text or HTML source code be obtained from the X clipboard? For example, if you copy some text from a web browser and paste it into KompoZer, it pastes as HTML, with links and other markup preserved. However, xclip -o on the same selection outputs only plain text, reformatted much like the output of elinks -dump. I'd like to pull the HTML out and into a text editor (specifically Vim).

I asked the same question on superuser.com, because I was hoping there was a utility to do this, but I didn't get any informative responses. The X clipboard API is still a mysterious beast to me; any tips on hacking something up to pull this information out are most welcome. My language of choice these days is Python, but pretty much anything is okay.


Solution

  • In X11 you have to communicate with the selection owner, ask which formats (targets) it supports, and then request the data in a specific format. I think the easiest way to do this is to use an existing windowing toolkit. For example, with Python and GTK (PyGTK):

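    #!/usr/bin/python
    # Python 2 with PyGTK (the `gtk` and `glib` modules).
    
    import glib, gtk
    
    def test_clipboard():
        # With no arguments, gtk.Clipboard() refers to the CLIPBOARD selection.
        clipboard = gtk.Clipboard()
        # Ask the selection owner which targets (formats) it offers.
        targets = clipboard.wait_for_targets()
        if not targets:
            print "No targets available (is the clipboard empty?)"
            return
        print "Targets available:", ", ".join(map(str, targets))
        # Request the contents once per target and dump the raw data.
        for target in targets:
            print "Trying '%s'..." % str(target)
            contents = clipboard.wait_for_contents(target)
            if contents:
                print contents.data
    
    def main():
        mainloop = glib.MainLoop()
        def cb():
            # Runs once inside the main loop, then stops it.
            test_clipboard()
            mainloop.quit()
        glib.idle_add(cb)
        mainloop.run()
    
    if __name__ == "__main__":
        main()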
    

    Output will look like this:

    $ ./clipboard.py 
    Targets available: TIMESTAMP, TARGETS, MULTIPLE, text/html, text/_moz_htmlcontext, text/_moz_htmlinfo, UTF8_STRING, COMPOUND_TEXT, TEXT, STRING, text/x-moz-url-priv
    ...
    Trying 'text/html'...
    I asked <a href="http://superuser.com/questions/144185/getting-html-source-or-rich-text-from-the-x-clipboard">the same question on superuser.com</a>, because I was hoping there was a utility to do this, but I didn't get any informative responses.
    Trying 'text/_moz_htmlcontext'...
    <html><body class="question-page"><div class="container"><div id="content"><div id="mainbar"><div id="question"><table><tbody><tr><td class="postcell"><div><div class="post-text"><p></p></div></div></td></tr></tbody></table></div></div></div></div></body></html>
    ...
    Trying 'STRING'...
    I asked the same question on superuser.com, because I was hoping there was a utility to do this, but I didn't get any informative responses.
    Trying 'text/x-moz-url-priv'...
    http://stackoverflow.com/questions/3261379/getting-html-source-or-rich-text-from-the-x-clipboard
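
The same pattern (list the targets, pick one, request it) can probably also be done without a toolkit by shelling out to xclip. What follows is only a minimal Python 3 sketch under a few assumptions: that the installed xclip accepts `-t` to name a target, that `TARGETS` is printed one name per line, and that some applications may hand back text/html as UTF-16 with a byte-order mark rather than UTF-8. The names `pick_target`, `xclip_read`, `decode` and the preference order are made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch: pick the richest clipboard target and fetch it through xclip."""
import subprocess
import sys

# Hypothetical order of preference; the first target the owner offers wins.
PREFERRED_TARGETS = ("text/html", "UTF8_STRING", "STRING")


def pick_target(available, preferred=PREFERRED_TARGETS):
    """Return the first preferred target present in `available`, or None."""
    offered = set(available)
    for target in preferred:
        if target in offered:
            return target
    return None


def xclip_read(target):
    """Return the CLIPBOARD selection converted to `target`, as raw bytes.

    Assumes an xclip that supports -t; raises if xclip is missing or fails.
    """
    result = subprocess.run(
        ["xclip", "-selection", "clipboard", "-o", "-t", target],
        check=True,
        stdout=subprocess.PIPE,
        timeout=5,
    )
    return result.stdout


def decode(data):
    """Decode clipboard bytes: UTF-16 if a BOM is present, otherwise UTF-8.

    The UTF-16 branch is an assumption about what some applications send.
    """
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        return data.decode("utf-16", errors="replace")
    return data.decode("utf-8", errors="replace")


def main():
    try:
        # TARGETS is itself a target: it lists the formats on offer.
        targets = decode(xclip_read("TARGETS")).split()
        target = pick_target(targets)
        if target is None:
            print("no usable text target on the clipboard", file=sys.stderr)
            return
        sys.stdout.write(decode(xclip_read(target)))
    except (OSError, subprocess.SubprocessError) as exc:
        # xclip not installed, no X display, or the request timed out.
        print("could not read the clipboard: %s" % exc, file=sys.stderr)


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as clip_html.py, the output could then be piped straight into the editor with `python3 clip_html.py | vim -`.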