Copy a Webpage to JSON
Bookmarklets have been around for a long time. The idea is to attach some JavaScript code to a bookmark. The way to install one has changed over the years; the tricky bit is copying the link into an existing bookmark. For example, make a bookmark for this page, then edit it in your browser and give it a name like "Convert a Webpage to JSON". Then copy the address of the link below called "Convert this page to JSON" and paste it into the bookmark's address field. After running the bookmarklet, inspect the result using JSONPath or a similar online JSON parser. If you use JSONPath with the output from the link below, try a path like `$..MAIN..UL[*].LI`.
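As a rough illustration of what "attaching JavaScript to a bookmark" means, a bookmarklet is just a `javascript:` URL. The sketch below shows one common way to build such a URL from a snippet of source code. The helper name `makeBookmarklet` and the sample snippet are hypothetical and are not taken from the bookmarklet on this page.

```javascript
// Hypothetical helper: turn a snippet of JavaScript into a bookmarklet URL.
// The snippet is wrapped in an immediately invoked function so its variables
// do not leak into the page, then percent-encoded so it survives being
// stored as a bookmark address.
function makeBookmarklet(source) {
  const wrapped = "(function(){" + source + "})();";
  return "javascript:" + encodeURIComponent(wrapped);
}

// Example snippet (an assumption for illustration only).
const url = makeBookmarklet("alert(document.title);");
console.log(url);
```

The printed string is what would be pasted into the bookmark's address field.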
Here is a list of useful features for this utility:
- The resulting JSON string is copied to the clipboard.
- It should handle double and single quotation marks properly.
- It would be nice if sections that are not visible on the page were left out of the JSON.
- Some sections are skipped, such as `SCRIPT`, `STYLE`, and `SVG`.
This code focuses on content, not layout. The only attribute that is preserved is `id`. The resulting JSON has HTML elements in uppercase, and `id` values are in lowercase. It works by recursively descending the DOM.
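The recursive descent can be sketched roughly as follows. This is not the bookmarklet's actual source: the function name `nodeToJson` and the output shape are assumptions, chosen so that a path like `$..MAIN..UL[*].LI` has something to match. Each element becomes an object keyed by its uppercase tag name, holding an array of its children, with a lowercased `id` stored alongside when present.

```javascript
// Tags whose whole subtree is dropped.
const SKIPPED_TAGS = new Set(["SCRIPT", "STYLE", "SVG"]);

// Convert one DOM node to a plain value, or return null to drop it.
// Assumed output shape (not confirmed by the original code):
//   element -> { TAGNAME: [children...], id: "lowercased-id" }
//   text    -> trimmed string
function nodeToJson(node) {
  if (node.nodeType === 3) {            // 3 is Node.TEXT_NODE
    const text = node.textContent.trim();
    return text === "" ? null : text;   // drop whitespace-only text
  }
  if (node.nodeType !== 1) return null; // 1 is Node.ELEMENT_NODE; skip comments etc.

  const tag = node.tagName.toUpperCase();
  if (SKIPPED_TAGS.has(tag)) return null;

  // Recurse into the children, keeping only those that produced a value.
  const children = [];
  for (const child of node.childNodes) {
    const value = nodeToJson(child);
    if (value !== null) children.push(value);
  }

  const result = { [tag]: children };
  if (node.id) result.id = node.id.toLowerCase();
  return result;
}
```

In a browser, something like `JSON.stringify(nodeToJson(document.body))` would produce the JSON string, and `JSON.stringify` takes care of escaping double quotes. Copying that string with `navigator.clipboard.writeText` is one option, though it generally requires a secure context and a focused page.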
Col 1 | Col 2 |
---|---|
Row 1, Col 1 | Row 1, Col 2 |
Row 2, Col 1 | Row 2, Col 2 |