Thoughtful, detailed coverage of everything Apple for 34 years
and the TidBITS Content Network for Apple professionals

Copytables Simplifies Extracting Tabular Data from Web Pages

On the Web, tables are everywhere—you may not even realize how many sites rely on tables behind the scenes for their formatting. Useful as they are for aligning content and displaying columnar data, tables can cause significant frustration if you need to extract data from them. I find myself wanting to do this quite often now that I have the open-source Copytables. I use the free Copytables Chrome extension in Arc, Brave, and Google Chrome, but you can also download Copytables for Firefox (forked from the Chrome version) and Copytables for Safari ($2.99 on the Mac App Store). I haven’t tested those.

Here’s an example of what Copytables makes possible. Earlier this week, I wanted to email someone a list of opportunities from the volunteer-management tool Helper Helper. To extract the text from the table in the Web app’s interface without Copytables, I would have to select the entire table (below left), copy it, and paste it into BBEdit (below right). The results in BBEdit aren’t terrible compared to some tables I’ve seen, but I’d still need to delete every other line. That would be doable with such a small table, but what if there were hundreds of lines or the data didn’t break cleanly at line breaks?

Selecting a table without Copytables

With Copytables in Arc, I instead pressed the Option key to enter cell-selecting mode and dragged over the cells in the leftmost column to select just them (below left). When I copied and pasted into BBEdit, I got exactly what I wanted (below right).

Selecting cells with Copytables

Another example. I’ve been in a pitched battle with spambots for the last few weeks, and one of my most successful interim defenses has been blocking IP ranges. Using Copytables, I can extract hundreds of IP addresses quickly from the logs of our WordPress security plug-ins. To select the contents of a column, I press Command-Option and click the column header. Then it’s trivial to copy the data into BBEdit for manipulation. You can even make discontinuous selections—cells that aren’t next to each other.

Selecting a column with Copytables
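For readers who would rather script the "manipulation" step than do it by hand in BBEdit, here is a minimal Python sketch of one way to turn a pasted column of IP addresses into candidate ranges to block. The article does not say how the ranges are derived, so grouping by /24 network is purely an assumption, as are the function name `count_by_range` and the sample addresses (taken from the reserved documentation ranges).

```python
import ipaddress
from collections import Counter

def count_by_range(pasted, prefix=24):
    """Group pasted IPv4 addresses by network and count hits per network.

    Assumption: a /24 prefix is a reasonable blocking unit; adjust as needed.
    """
    counts = Counter()
    for line in pasted.splitlines():
        text = line.strip()
        try:
            address = ipaddress.IPv4Address(text)
        except ValueError:
            continue  # skip column headers, blank lines, and anything not IPv4
        # strict=False lets us build the network from a host address
        network = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
        counts[str(network)] += 1
    # Most frequently seen networks first
    return counts.most_common()

# Hypothetical column pasted from a log table, header cell included
sample = "IP Address\n192.0.2.10\n192.0.2.77\n198.51.100.5\n192.0.2.200"
for network, hits in count_by_range(sample):
    print(network, hits)
```

Networks that show up repeatedly are the likeliest candidates for a range block, though you would want to check who owns a range before blocking it wholesale.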

Copytables also enables you to select rows or entire tables. Those features aren’t as commonly used—they don’t even get default keyboard shortcuts—so whenever I need them, I open the Copytables window (from the Extensions menu in Arc, or by clicking a pinned extension toolbar icon in Brave or Google Chrome), click Rows or Tables to enable the associated capture mode, and then click to select. If you do use the Capture buttons, make sure to disable them when you’re done, or certain Web apps won’t work correctly due to Copytables capturing their clicks. To select entire tables more quickly, click the Previous Table or Next Table buttons in the Find row. 

Copytables window

The Copy options at the top require more explanation. I haven’t needed them, but they offer various formats for the copied data, some of which could be handy (I’m particularly taken with the options that swap columns and rows). Theoretically, you can set one of these options as the default to use with Edit > Copy, but that didn’t work in my quick testing. Stick to the buttons in the Copytables interface.

  • As is: Copy the table as seen on the screen
  • Plain Table: Copy the table without formatting
  • Text: Copy as tab-delimited text
  • Text+Swap: Copy as tab-delimited text, swap columns and rows
  • CSV: Copy as comma-separated text
  • CSV+Swap: Copy as comma-separated text, swap columns and rows
  • HTML+CSS: Copy as HTML source with formatting
  • HTML: Copy as HTML source without formatting
  • Textile: Copy as Textile (text content)
  • Textile+HTML: Copy as Textile (HTML content)
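To make the Text, CSV, and Swap options more concrete, here is a rough Python sketch of what those formats amount to for a small table. It is not Copytables' own code; the function names and the sample table are hypothetical, and the real extension may quote or escape cells differently.

```python
import csv
import io

def swap(rows):
    """Swap columns and rows (transpose). Assumes every row has the same length."""
    return [list(column) for column in zip(*rows)]

def to_text(rows):
    """Tab-delimited text, one table row per line."""
    return "\n".join("\t".join(row) for row in rows)

def to_csv(rows):
    """Comma-separated text; the csv module quotes cells that contain commas."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue().rstrip("\n")

# Hypothetical sample table: a header row plus two data rows
table = [
    ["Task", "Hours"],
    ["Setup, cleanup", "2"],
    ["Greeting", "3"],
]

print(to_text(table))        # roughly what "Text" produces
print(to_csv(swap(table)))   # roughly what "CSV+Swap" produces
```

The swapped versions are handy when a table has a few long rows that would read better as columns in a spreadsheet.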

Copytables has one other clever feature I occasionally find handy: the infobox. It’s an inset box that shows information about your current selection. Consider this table of data about Canadian wildfires from 2000–2021. When I select the contents of the Area Burned column, Copytables displays the blue infobox at the top that counts the number of selected cells, calculates the sum and average, and calls out the min and max. These simple calculations can preclude the need to move data to a spreadsheet.

Showing the Copytables infobox
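If you do end up pasting a column somewhere else, the same figures are easy to reproduce. The following Python sketch computes a count, sum, average, min, and max from pasted cells. It is only an approximation of the infobox: the `summarize` function and the sample values are hypothetical, and unlike the infobox's cell count, this version counts only the cells it can parse as numbers.

```python
def summarize(cells):
    """Return count, sum, average, min, and max for the numeric cells in a selection."""
    values = []
    for cell in cells:
        cleaned = cell.replace(",", "").strip()  # drop thousands separators
        try:
            values.append(float(cleaned))
        except ValueError:
            continue  # skip headers and other non-numeric cells
    if not values:
        return {"count": 0}
    total = sum(values)
    return {
        "count": len(values),
        "sum": total,
        "average": total / len(values),
        "min": min(values),
        "max": max(values),
    }

# Hypothetical pasted column, header cell included
pasted = "Area\n1,200\n800\n400"
stats = summarize(pasted.splitlines())
print(stats)
```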

If the infobox gets in your way, you can turn it off or have Copytables display it in a different corner of the window. To access this and other settings, click the Options link in the Copytables window. The most useful settings are the modifier keys for click-and-drag selection (below). You’ll want to adjust these if they conflict with something else on your system. The Copytables window also has a Keyboard Shortcuts link that provides a browser-wide approach to setting keyboard shortcuts for extensions; the Copytables options match the Find and Capture buttons in its window.

Copytables modifier keys

The Copytables Chrome extension is free, but if you find it useful, you can join me in donating to the author, Georg Barikin.


Comments About Copytables Simplifies Extracting Tabular Data from Web Pages

Notable Replies

  1. I’m a learning technologist at a university where we use Blackboard Learn as the virtual learning environment (VLE) and it’s not uncommon that I have to grab large swathes of data from it. Not unlike you, I’ve often done a cmd-A, cmd-C and cmd-V into BBEdit before going through the rigmarole of excluding the bits I don’t need, before moving the data over to Excel. I’m pretty adept at using BBEdit to whittle it down, but…

    This seems pretty awesome! I’ve just had a quick play with it on tables that extend to 1,000 rows. (A kilotable?) It dealt with selecting the first few columns from a wider table without breaking a sweat.

    One observation - it initially seemed more miss than hit for making its selection. And then I twigged - at least in the case of these Blackboard tables, you seem to have to start your initial click in the upper left corner of the first table cell. Clicking in the centre of the top cell (typically over the text in the column header) doesn’t seem to work.

    Ooh - awesome+ You can select discontiguous columns.

  2. I’ve noticed that it can be a little finicky about where you have to click too, but I’ve chalked that up to tables being formatted in odd ways.

    And yes, discontinuous columns! Forgot to mention that—I’ll add it.

  3. Not a biggie, but it isn’t free. $2.99 in the App Store for Safari.

  4. Oops! Updating—the Chrome extension is free.

  5. Does it work with the web interface to Gmail?

    In other words, if you use it to copy a table from a web page, can the table be pasted into a Gmail message (within a browser window)?

  6. Copying and pasting directly from an HTML table to Gmail (at least with Mailplane) does not give you what you are probably looking for. However, paste into BBEdit and then copy/paste from there works fine. Columns are separated by tabs.

    Interestingly enough, copying from HTML and pasting into TextEdit produces something that looks more like a table.

    And then copying that and pasting it into Gmail produces the identical format. Might be more like what you are seeking.

  7. Thanks, @hartley. I’m not getting the same results in TextEdit, but I’m not entirely surprised—clipboard formats are a black box, and I’m probably doing something slightly differently.

    What worked for me was to run the HTML code through HTML Cleaner, a Web utility I found long ago, and then to copy the rendered HTML again. That put formatted HTML on the clipboard for pasting into Mimestream, TextEdit, and yes, Gmail’s Web interface.

  8. Thanks, Adam! I have shelled out for the Safari version and it’s working well for me. Regular expressions in BBEdit will clean up anything given enough patience (I use BBEdit’s Text Filters for frequent jobs), but Copytables is easier.

  9. Thanks @ace, what a fantastic extension - I use it with Safari and works great with HTML tables so far. It spares me from wrangling the formatting and such - a great productivity and quality-of-life improvement.

  10. I’m once again displaying my acumen for not understanding even the Dummies for Dummies series books. I bought and downloaded it and installed the extension. Great–it selected the columns or the cells I clicked in (option or command key). But Command C copied nothing. Went to Edit menu, all options are grayed except Select All. I saw a comment in the Review on the app store, “I only wish it would use the regular command C to copy the selection, though.” My thoughts exactly but I can’t seem to see how to copy it using ANY command or menu option. Nor do I see the “Popup Menu” referred to by the creator for copying in various formats. Please take pity on me and be kind with the “You Dummy!” responses I know some are probably formulating. I cry easily. Thanks for any advice on how to do more than just select. I’d like to copy and paste as well.

  11. Never mind. I knew as soon as I broadcast the fool that I am, the answer would appear. The little icon that shows up next to the URL in Safari seems to provide the answer. Apologies for wasting your time.

  12. I find it a bit erratic depending on the table and the web site. Some it will work fine with, others it is just like not using it.

  13. It is not clear to me how you copy an entire table that spans several web pages. For example Strava segment results can span several pages with each page containing 25 results. I can copy the bit of table displayed on the single web page but how do I get the entire table without having to do each individual page?

  14. Wow, this takes me back to 1984 and many long hours studying the Inside Mac volumes. A long forgotten item in IM concerned the clipboard. IM described THREE types of basic content there: text, pictures and (ta-da!) tables! However, this was not actually implemented in the API. There was an Apple-run discussion among some Mac developers called the Table Group that tried to hammer out specs but it rapidly became unwieldy and overcomplicated and nothing became of it. I think Apple sent us a certificate or some such in thanks.

  15. I use an extension called Table Capture. It’s not free, but it handles multi-page and scrolling tables. You have to scroll or page through them yourself, but as you’re doing so, Table Capture is grabbing them. It can also export to Excel or Google Sheets (as well as just copying the table). As I said, it’s not free (it’s $12/year), but I copy tables, particularly scrolling tables, enough that it’s worth it to me.

  16. Paradoxically, the paid-for Safari extension is significantly less capable than the free plugins for other browsers, presumably thanks to App Store hoop-jumping. It’s still useful, but the paste formats are limited to Formatted (= As is), CSV regular and “transposed” (= Swap), HTML, and Markdown; there are no markup-free plain-text options at all, so some kind of regex cleanup is needed in most everyday use cases. There are also no Find or Capture options in the Copytables window.

  17. That may be due to truly wacky HTML that it can’t parse. I’ve not had problems, but my suggestion would be to try alternative selections (just columns instead of the entire table, for instance).

  18. @ace Having written a web page table parser I can assure you that wacky HTML is the norm rather than the exception. Good ol’ simple html tables are a comparative breeze because they’re a simple structure (cough) but modern css-driven tables are a nightmare because you can randomly place cells visually that have nothing to do with the normal left-right; top-down expectation. (I gave up.)

    I noticed mention of clipboards somewhere in here and the modern clipboard is nothing like the old one. You think “a” clipboard, the reality is 6, 8, 10 or more different expressions of the same stuff. It wouldn’t surprise me to discover an EBCDIC clipboard at some point. In fact, the clipboards are a pretty miraculous piece of engineering. Complicated translation on the fly. . . .

    As far as grabbing tables go, I’ve found copying a web page (not necessarily a table) into Pages has pretty amazing results, accuracy and non-hair-pulling wise. I’m not sure why TextEdit results and Pages results are different (given the probably identical engines) but they are. You then copy that out of Pages into BBEdit and away you go. . . .

    Dave
