Mail Archives: geda-user/2015/09/03/23:58:20

X-Authentication-Warning: delorie.com: mail set sender to geda-user-bounces using -f

X-Recipient: geda-user AT delorie DOT com

Date: Fri, 4 Sep 2015 06:00:42 +0200 (CEST)

X-X-Sender: igor2 AT igor2priv

To: "Ouabache Designworks (z3qmtr45 AT gmail DOT com) [via geda-user AT delorie DOT com]" <geda-user AT delorie DOT com>

X-Debug: to=geda-user AT delorie DOT com from="gedau AT igor2 DOT repo DOT hu"

From: gedau AT igor2 DOT repo DOT hu

Subject: Re: [geda-user] Interesting blog post from a commercial EDA vendor - pdf

In-Reply-To: <CAOP4iL3YWQ_MH3HNnyDHMGCGeYFBmazwcw7Af_GATQzAUQJ57g@mail.gmail.com>

Message-ID: <alpine.DEB.2.00.1509040545240.6924@igor2priv>

References: <CAOP4iL3YWQ_MH3HNnyDHMGCGeYFBmazwcw7Af_GATQzAUQJ57g AT mail DOT gmail DOT com>

User-Agent: Alpine 2.00 (DEB 1167 2008-08-23)

MIME-Version: 1.0

Reply-To: geda-user AT delorie DOT com

Errors-To: nobody AT delorie DOT com

X-Mailing-List: geda-user AT delorie DOT com

X-Unsubscribes-To: listserv AT delorie DOT com

On Thu, 3 Sep 2015, Ouabache Designworks (z3qmtr45 AT gmail DOT com) [via geda-user AT delorie DOT com] wrote:

>
>https://medium.com/@zakhomuth/disrupting-electronic-design-automation-8988f72299e3

Btw, somewhat off-topic: a part that geda-user discussions usually don't
cover is pdf datasheets. I really like his rant on how useless
distributing data in pdf is.

I face that problem from time to time. Last December I had it with an ARM
Cortex. I wanted to extract the register names, bit names and magic values
(e.g. this bit in this register always has to be 1). The C source and
other stuff come with an EULA that doesn't let me do what I want.
The datasheet is in pdf. Most of the relevant data is in almost uniform
tables.

I thought I'd just convert the pdf to html and extract <table> nodes... I
laugh at this idea in retrospect. I tried with various tools and various
settings. Never got a <table>. It turned out the pdf just draws the borders
and draws the text separately. The render looks as if it were a table.
The html some tools produce looks the same as the pdf. In practice, it's
not a table in that html, just a big background bitmap with the lines,
and the text printed onto it at pixel coords.

I ended up with a "table mapping" script that takes the bitmap, scans it
for lines and columns to map cell coordinates, then reads all the text from
the html and determines which cell each piece of text is in.
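
To illustrate the idea (this is not the original script, just a minimal
sketch of the same technique): assume the background bitmap is already
loaded as a list of pixel rows with 1 for dark and 0 for light, that the
ruling lines span the whole bitmap, and that the text has been pulled from
the html as (x, y, text) tuples. All function names and the min_fill
threshold are made up for this example.

```python
from bisect import bisect_right


def collapse(indices):
    """Merge runs of adjacent indices (thick lines) into the first index of each run."""
    out = []
    prev = None
    for i in indices:
        if prev is None or i != prev + 1:
            out.append(i)
        prev = i
    return out


def find_rules(bitmap, min_fill=0.9):
    """Return (row_rules, col_rules): pixel rows and columns that are mostly dark.

    Assumption: a ruling line covers at least min_fill of the bitmap's
    width (for a horizontal line) or height (for a vertical line).
    """
    height = len(bitmap)
    width = len(bitmap[0])
    rows = [y for y in range(height) if sum(bitmap[y]) >= min_fill * width]
    cols = [x for x in range(width)
            if sum(bitmap[y][x] for y in range(height)) >= min_fill * height]
    return collapse(rows), collapse(cols)


def cell_of(x, y, row_rules, col_rules):
    """Map a pixel coordinate to a (row, col) cell index, or None if outside the grid."""
    r = bisect_right(row_rules, y) - 1
    c = bisect_right(col_rules, x) - 1
    if 0 <= r < len(row_rules) - 1 and 0 <= c < len(col_rules) - 1:
        return r, c
    return None


def map_table(bitmap, words):
    """Assign each (x, y, text) word to a cell; return {(row, col): joined text}."""
    row_rules, col_rules = find_rules(bitmap)
    cells = {}
    for x, y, text in words:
        cell = cell_of(x, y, row_rules, col_rules)
        if cell is not None:
            cells.setdefault(cell, []).append(text)
    return {cell: " ".join(parts) for cell, parts in cells.items()}
```

Words are joined in the order they are given, so the caller would sort
them by position first if the html does not list them in reading order.
Real datasheets also have merged cells and partial lines, which this
sketch ignores.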

And this is only the first step, on the lowest level, of converting the
data of a datasheet to a machine readable form... Upper levels, in separate
scripts, took the table map and tried to read the header and convert the
info into a register description.
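
A rough sketch of what such an upper level could look like, assuming the
table map is a {(row, col): text} dictionary and that row 0 is the header.
The output layout and the function names are invented for this example,
not taken from the actual scripts.

```python
def rows_from_cells(cells):
    """Turn {(row, col): text} into a list of row lists, filling gaps with ''."""
    if not cells:
        return []
    n_rows = max(r for r, _ in cells) + 1
    n_cols = max(c for _, c in cells) + 1
    return [[cells.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


def describe_register(name, cells):
    """Use the header row as field keys and turn each later row into a dict.

    Assumption: the first row of the table is the header; real tables
    with multi-row or missing headers would need extra handling.
    """
    rows = rows_from_cells(cells)
    if not rows:
        return {"register": name, "fields": []}
    header = [h.strip().lower() for h in rows[0]]
    fields = [dict(zip(header, (v.strip() for v in row))) for row in rows[1:]]
    return {"register": name, "fields": fields}
```

A "magic value" note such as "must be 1" then shows up as an ordinary
field of the bit's entry, and a later pass can interpret it.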

I agree with the upverter guy. In the age of thousand page datasheets,
a non-machine-readable format is a bug that needs to be fixed. On the other
hand, I'm highly sceptical about vendors being cooperative on this.

Regards,

Igor2
