It is not very difficult once you know how:
- highlight your web table and copy it (Ctrl+C)
- open your spreadsheet and paste (Ctrl+V) into it
- if there are header rows you do not want, delete each one as a whole row
- likewise, delete any unwanted columns as whole columns
- to remove images: Home > Find & Select > Go To Special > Objects. All images will be selected; press Delete
- then you can manipulate your spreadsheet
- save your spreadsheet
To highlight an area, click the top-left cell, move to the bottom-right cell, and Shift-click it.
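If the same table has to be imported repeatedly, the manual steps above can be approximated with a short script. The following is only a sketch in Python using the standard library; the names `TableParser` and `table_to_csv`, the `skip_rows` and `drop_columns` parameters, and the sample table are all made up for illustration. It assumes a simple table without nested tables or merged cells.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row being built, or None outside a <tr>
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
        # <img> tags carry no text, so images simply drop out

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html, skip_rows=0, drop_columns=()):
    """Return the table as CSV text, minus leading rows and unwanted columns."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in parser.rows[skip_rows:]:
        writer.writerow([c for i, c in enumerate(row) if i not in drop_columns])
    return out.getvalue()


# Hypothetical sample table: a header row, an image column, two data rows.
sample = """
<table>
  <tr><th>Name</th><th>Logo</th><th>Price</th></tr>
  <tr><td>Widget</td><td><img src="w.png"></td><td>4.50</td></tr>
  <tr><td>Gadget</td><td><img src="g.png"></td><td>7.25</td></tr>
</table>
"""

# Drop the header row and the image column (index 1).
print(table_to_csv(sample, skip_rows=1, drop_columns={1}))
```

The resulting CSV text can be saved to a file and opened in the spreadsheet, where it can be manipulated as usual.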
This is a preview of Importing Table Data from the Web to a Spreadsheet.
When a requester, whether a person or a bot, requests a resource from your server, Apache logs the request in the raw access log. The requester also sends some information about itself in the HTTP request headers. Apache does not log these headers by default, but with a little PHP added to the HTML, this extra information can be logged and examined to help determine whether the requester is a bot or a human.
Because an additional file is created every day, I opted to put these files into a subdirectory. The headers are logged one per line into a headers-yyyymmdd.log file, in what seems to be a free-form format. Different requesters send different sets of headers.
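The post itself uses PHP for this; the sketch below only illustrates the same logging logic in Python, not the actual code. The function name `log_headers`, the `Name: value` line format, the blank line between requests, and the sample header values are assumptions. Only the headers-yyyymmdd.log file name, the one-header-per-line layout, and the use of a subdirectory come from the description above.

```python
import tempfile
from datetime import date
from pathlib import Path


def log_headers(headers, log_dir, today=None):
    """Append one request's headers, one per line, to headers-yyyymmdd.log."""
    today = today or date.today()
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)  # subdirectory for the daily files
    log_file = log_dir / f"headers-{today:%Y%m%d}.log"
    with log_file.open("a", encoding="utf-8") as fh:
        for name, value in headers.items():
            fh.write(f"{name}: {value}\n")
        fh.write("\n")  # assumption: a blank line separates requests
    return log_file


# Two hypothetical requesters that send different sets of headers.
bot = {"User-Agent": "ExampleBot/1.0", "Accept": "*/*"}
browser = {
    "User-Agent": "Mozilla/5.0",
    "Accept": "text/html",
    "Accept-Language": "en-US",
}

with tempfile.TemporaryDirectory() as tmp:
    path = log_headers(bot, Path(tmp) / "headers", date(2024, 1, 31))
    log_headers(browser, Path(tmp) / "headers", date(2024, 1, 31))
    print(path.name)
    print(path.read_text(encoding="utf-8"))
```

Because each requester sends its own set of headers, the blocks in the log differ in length, which fits the free-form look of the file.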
In my current contract I had the opportunity to work with optical character recognition (OCR). We had over 50 paper documents, published before 1991, that needed to be digitized and published on the internet. While these documents are old, they contain in-depth knowledge that simply needed to be shared with the world. OCR, however, has its quirks and is not all that straightforward. Some quirks are due to the age and handling of the original documents over the years, and some are due to the typographical or layout decisions of the original publishers. No matter the reason, the documents are not to be found online and you need them on the internet, so the monkey is now on your back.
This is a preview of Using Optical Character Recognition (OCR): Observations.