Mail Archives: geda-user/2016/01/03/09:49:11

X-Authentication-Warning: delorie.com: mail set sender to geda-user-bounces using -f
X-Recipient: geda-user AT delorie DOT com
X-TCPREMOTEIP: 207.224.51.38
X-Authenticated-UID: jpd AT noqsi DOT com
Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\))
Subject: Re: [geda-user] A fileformat library
X-Pgp-Agent: GPGMail 2.5.2
From: John Doty <jpd AT noqsi DOT com>
In-Reply-To: <CAC4O8c8OpV2zNoZF48N3_CQBdZVO40mcpD7fFGs5aZ_4r2QY7g@mail.gmail.com>
Date: Sun, 3 Jan 2016 07:48:52 -0700
Message-Id: <E537CDD8-BE58-4A3C-8D1C-0AC0D31A89E7@noqsi.com>
References: <1512221837 DOT AA25291 AT ivan DOT Harhan DOT ORG> <CAJXU7q_mXmipJ1fLvLpuLvnYjktV2SHoA+bG=L5+E-EfdygeOA AT mail DOT gmail DOT com> <s6n37uumanm DOT fsf AT blaulicht DOT dmz DOT brux> <CAJXU7q_qxdvJaejF-VcY=u7VHZ-zrfrc+Z7-qSwfFyPdy-umxw AT mail DOT gmail DOT com> <B02363CD-469D-493A-AC15-1D5DC7836982 AT noqsi DOT com> <20151222232230 DOT 12633 DOT qmail AT stuge DOT se> <0F6F1D0F-4F07-48EA-90FE-836EAD4E2354 AT noqsi DOT com> <CAM2RGhTficnys3a4xs=UBFvk8aPwpzYWUADFLP_pUQ+R1iKs0g AT mail DOT gmail DOT com> <0FCF3774-F93C-4BFF-BB61-636F75DCCACB AT noqsi DOT com> <CAC4O8c_UAiFE-vGfoE2tXppHLhaa0dSYz9o_rkdCBo7_SRRtxw AT mail DOT gmail DOT com> <FFBE7623-E240-4798-96B0-2BECF56C8E29 AT noqsi DOT com> <CAC4O8c980g1gj15=5njstC_BT-WYDgKQx9BRycdFKA8OvgtiOg AT mail DOT gmail DOT com> <B54C0E1F-1986-4C79-9F70-7F1919B8B26D AT noqsi DOT com> <CAC4O8c9bxJP1eMG4yz3YwKkQJRmsDGmLQ0aMd5pJRyu0WpdCtQ AT mail DOT gmail DOT com> <C1CFCCEE-C64A-4E49-AA64-446C061656D6 AT noqsi DOT com> <CAC4O8c-zt8B=joDd+ws77D2jt6aZf3MWfR_dAvpzGcNuBrTURQ AT mail DOT gmail DOT com> <alpine DOT DEB DOT 2 DOT 11 DOT 1601030040320 DOT 2176 AT newt> <CAC4O8c-5S-PgE=RFXrAG2xRzmV4x3odVip0eUwih-iEz!
Xs-UOg AT mail DOT gmail DOT com> <AE426D72-46DE-4941-9D58-95015A10C6EA AT noqsi DOT com> <CAC4O8c93CJm5LRehr28zzUTa6eqG9QQgBUMCz=zNpiwZPGOk4Q AT mail DOT gmail DOT com> <FCE4FD21-C98F-4860-B78B-EBF657427E91 AT noqsi DOT com> <CAC4O8c8OpV2zNoZF48N3_CQBdZVO40mcpD7fFGs5aZ_4r2QY7g AT mail DOT gmail DOT com>
To: geda-user AT delorie DOT com
X-Mailer: Apple Mail (2.1878.6)
Reply-To: geda-user AT delorie DOT com
Errors-To: nobody AT delorie DOT com
X-Mailing-List: geda-user AT delorie DOT com
X-Unsubscribes-To: listserv AT delorie DOT com

--Apple-Mail=_485C7A71-49FC-4AEF-8D2E-E7A4B212F0DE
Content-Type: multipart/alternative;
	boundary="Apple-Mail=_4FFE1E41-CC5C-4240-B661-6AB5A5FFE42F"


--Apple-Mail=_4FFE1E41-CC5C-4240-B661-6AB5A5FFE42F
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=windows-1252

Was the intended subject the *pcb* file format?

On Jan 3, 2016, at 12:53 AM, Britton Kerin (britton DOT kerin AT gmail DOT com) [via geda-user AT delorie DOT com] <geda-user AT delorie DOT com> wrote:

>
>
> On Sat, Jan 2, 2016 at 8:19 PM, John Doty <jpd AT noqsi DOT com> wrote:
>
> On Jan 2, 2016, at 9:27 PM, Britton Kerin (britton DOT kerin AT gmail DOT com) [via geda-user AT delorie DOT com] <geda-user AT delorie DOT com> wrote:
>
>>
>>
>> On Sat, Jan 2, 2016 at 6:07 PM, John Doty <jpd AT noqsi DOT com> wrote:
>>
>> On Jan 2, 2016, at 7:47 PM, Britton Kerin (britton DOT kerin AT gmail DOT com) [via geda-user AT delorie DOT com] <geda-user AT delorie DOT com> wrote:
>>
>>>
>>>
>>> On Sat, Jan 2, 2016 at 4:38 PM, John Doty <jpd AT noqsi DOT com> wrote:
>>>
>>> On Jan 2, 2016, at 6:07 PM, Britton Kerin (britton DOT kerin AT gmail DOT com) [via geda-user AT delorie DOT com] <geda-user AT delorie DOT com> wrote:
>>>
>>>> Personally I find formats like this:
>>>>
>>>>   device=RESISTOR
>>>>   T 44400 49300 5 10 1 1 90 0 1

So, the subject is the .sch file format?

>>>>
>>>> substantially less readable than ones with field names, but they are indeed easy to parse.
>>>
>>> Personally, I rarely edit these things manually except for the text fields, which are not difficult to find. The fact that they’re easy to parse is handy for automation.
>>>
>>>>   The pcb format is quite a bit more elaborate and the savings from not rolling your own parser are more significant.
>>>>
>>>> I think your criteria for what should go in libgeda are spot-on btw.  Nor do I have any problem with a C interface calling python or gschem or for that matter C++.  I do think providing a clean C interface to libgeda gets by far the best return on investment, since it's so widely known and with a little care wrappers can then be provided almost automatically for a wide variety of languages (via SWIG or some other similar mechanism -- or maybe Xorn facilitates this, I'm a little unclear).
>>>
>>> I don’t find deconstructing C data structures particularly easier than parsing the format above. Just another layer I have to penetrate to get to the data. I do significant processing with simple things like sed, which don’t handle binary data.
>>>
>>> Wrappers CAN be provided, but will they? FFI programming is not the easiest thing. I hear complaints about the need for developers to maintain code. It seems to me that one way to address these concerns is to avoid and eliminate unnecessary code.
>>>
>>> Good question.  It's a great result if you get it but a lot more work than using a serialization library, which is why the latter approach seems to me like a useful step in the right direction.
>> Serialization library? Why do you want an extra, unnecessary, opaque interface? What, exactly, are you trying to accomplish?
>>
>> Two things:
>>
>>     1.  A human- and partial-parser-script-readable format
>
> We have that, I think. But you left out the most important virtue: *simple*.
>
> I agree that it's readable enough, though it could be better.  I also agree that simplicity is good.
>
>>     2.  Full parsers for as many languages as possible without writing them by hand
>
> So instead, you need to write an interface between a complicated parser and every application by hand. Where’s the gain?
>
> Here's what YAML looks like from perl:
>
>      use YAML::XS;
>
>      my $yaml = Dump [ 1..4 ];
>      my $array = Load $yaml;

But you left out the next step: you have to deconstruct whatever it built to do anything with it. To do that, you have to understand the construction. While if I simply read the file (the format is too trivial for a reader to deserve the name “parser”), I go directly to *my* application’s model of the underlying data, on a trajectory that matches what I need. Your example seems to build some sort of complex data structure. What if a line-by-line data-driven approach is more natural?
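
A minimal sketch of such a line-by-line reader, written in Python purely for illustration. It assumes the text-object layout from the gEDA file format spec (a `T` line whose tenth field is the number of text lines that follow); the function name `collect_attributes` and the sample data are hypothetical and not taken from any gEDA tool.

```python
import io


def collect_attributes(lines):
    """Yield (name, value) pairs from a gEDA-style schematic, line by line.

    Assumption: a text object is a line starting with "T " whose tenth
    whitespace-separated field is the count of text lines that follow.
    """
    lines = iter(lines)
    for line in lines:
        fields = line.split()
        if fields and fields[0] == "T":
            num_lines = int(fields[9])
            for _ in range(num_lines):
                # Default to "" so a truncated file ends quietly.
                text = next(lines, "").rstrip("\n")
                name, sep, value = text.partition("=")
                if sep:  # only name=value text counts as an attribute
                    yield name, value


# Hypothetical sample input; a real tool would pass an open .sch file.
sample = io.StringIO(
    "v 20130925 2\n"
    "T 44400 49300 5 10 1 1 90 0 1\n"
    "device=RESISTOR\n"
    "T 44400 49500 5 10 1 1 90 0 1\n"
    "refdes=R1\n"
)
attrs = dict(collect_attributes(sample))
print(attrs)
```

The reader never builds a generic tree: each line is either skipped or turned straight into whatever the application wants, here a plain dictionary of attributes.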

>
> The gain is that this is a vastly easier way to vivify a saved object than to write my own parser,

I disagree. If the format were more complicated, you’d have a point, but it’s not.

> or even my own partial parser for non-trivial cases.
>> Now take a look at the design goals for YAML:
>>
>>     http://www.yaml.org/spec/1.2/spec.html#id2708649
>>
>> It's a good fit.  If it was only a matter of the technical merits I would say as close to perfect as it gets with software.
>
> Compare it to http://wiki.geda-project.org/geda:file_format_spec
> YAML is enormously more complex to no advantage for us.
>
> The point is that you don't have to deal with any of that complexity

It becomes a dependency for *every* tool. It will break (they always do). It won’t work with *every* language on every OS.

> (of which there really isn't all that much -- calling it enormously complex is a big overstatement).  It's a library with approximately two entry points per language for modern languages, and not much more for C.

AWK?

> Parsing may be a non-issue for you if you only care about strings in .sch files, but for many useful operations on pcbs you need the whole thing, or most of it.

Pcb may need it, but that’s a completely different issue. We’re talking about .sch files.

Can we *please* separate the projects so that we don’t keep going through this kind of thing?

>> Unfortunately there's the usual good-versus-most-popular trade-off in deciding between YAML and JSON.  I still favor YAML in this case, largely because I can't look at people like you and honestly claim that JSON is in all respects fun to read/edit/sed over etc., and because my personal experience with JSON is that although the parsers are truly ubiquitous they have some annoying characteristics (at least the Perl one does).
> But since it doesn’t relieve the need of the application programmer to understand the interface, it is merely adding more code for no gain (or even
>
> I'm not sure what you mean by this.  The programmer needs to understand what the fields mean, sure.  YAML/JSON helps somewhat with this, because the fields have names.  Even if you do understand the existing format, that understanding will absolutely not get you a live editable version of what's in a pcb file without a lot of (pointless) additional work.

We’re not talking about pcb. What you showed above is .sch.

>
> negative gain, given the added complexity). And neither YAML nor JSON is as universally readable and processable as the format we have.
>
> There's no added complexity to speak of for clients, and YAML is far more readable and at least as processable as what we have now.  I think your view of things is strongly tied to your particular use case.

Cases. Many cases.

> It sounds like you mostly work on attributes with their own special meaning (IIRC noqsi has attributes with their own syntax),

Perhaps you mean gnet-spice-noqsi? All of the simulation flows have special attributes beyond what pcb uses. Some PCB layout flows do too (do you know why some library symbols have pins= and class=?).

> and don't have to parse everything.  That's fine.  I sure don't want to break anything for you.

It sounds like you do. It sounds like you’re volunteering to rewrite everybody’s custom scripts for things like symbol generation and refdes renumbering.

>
> However, if you consider the actual problem I'm hoping to address you might sympathize at least with the thought that not reinventing the parser everywhere might be worthwhile.

“Reinvent” and “parser” imply a difficulty that we don’t have. Is it really that hard to compose a few strings like:

"T %d %d %d %d %d %d %d %d %d"

Or just pick out a field you need with something like (data-driven AWK):

/^T /{
	numlines = $10
	# do something with the text lines
}
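
The same two operations, composing a `T` line from a format string and picking the line count back out of it, can be sketched in Python for illustration. The field order (x, y, color, size, visibility, show_name_value, angle, alignment, num_lines) is assumed from the gEDA file format spec; the helper names `compose_text` and `num_text_lines` are hypothetical.

```python
# Same nine-integer layout as the format string quoted above.
T_FORMAT = "T %d %d %d %d %d %d %d %d %d"


def compose_text(x, y, text, color=5, size=10, visible=1, show=0, angle=0, align=0):
    """Build a text object: one T header line followed by the text lines."""
    lines = text.split("\n")
    header = T_FORMAT % (x, y, color, size, visible, show, angle, align, len(lines))
    return "\n".join([header] + lines)


def num_text_lines(header):
    """Return the tenth field of a T line, i.e. what AWK calls $10."""
    return int(header.split()[9])


block = compose_text(44400, 49300, "device=RESISTOR", show=1, angle=90)
header = block.splitlines()[0]
print(header)
print(num_text_lines(header))
```

Writing is a single format operation and reading is a single split, which is the sense in which the job is too small to be called parsing.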

> I started out to write a quick parser in perl,

Why did you do that? Why didn’t you just start out to write a tool in Perl, with reading happening as the tool needs it? If you approach the problem assuming it’s hard, you make it hard.

> in exactly the way you seem to be proposing should be the way to do everything.

You didn’t understand.

>   It's a significant hassle and you end up with a slow parser that only works from one language.  As you've pointed out yourself, parsing (and serialization) is a relatively trivial, thoroughly solved problem.  Why reinvent the solution?

1. The project has a significant investment in the other approach (tragesym, refdes_renum).

2. Users have an enormous investment: there are lots of custom scripts that various people have mooted here over the years, and I’d guess there are many more private ones.

3. You’re using heavy words: “parser”, “reinvent”, etc. for a lightweight job.

4. The job of decoding the output of a universal parser isn’t much, if at all, simpler than just reading the file if the file encoding is simple.

5. Serialization is appropriate when you start from well-defined common binary data structures. We don’t have that. We have a well-defined common text format.

> I've taken some time over this because at least one other person indicated that they shared your concern about using a generic parser rather than an arbitrary custom format.  So I'd like to actually convince you, lest you convince others that doing as I propose is a bad idea for pcb.

It might not be a bad idea for pcb.

>   I'd also like to apologize for bad attitude and rudeness I've shown you in the past, and hope you're able to view this issue in technical terms alone (I confess that I sometimes have difficulty doing this with your emails).

No apology needed. I’m a scientist, I’m used to this. I recall the time I was publicly accused of “witchcraft” for digging out a result using an unfamiliar statistical method. The accuser then went back to her lab, took another look at her own data, and proved I was right. We never saw a dispute as personal. Scientific research is like this: we have titanic arguments with our friends. That’s how we bring all of the facts and ideas to light and get the science right.

>
> Britton
>

John Doty              Noqsi Aerospace, Ltd.
http://www.noqsi.com/
jpd AT noqsi DOT com



--Apple-Mail=_4FFE1E41-CC5C-4240-B661-6AB5A5FFE42F
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html;
	charset=windows-1252

<html><head><meta http-equiv=3D"Content-Type" content=3D"text/html =
charset=3Dwindows-1252"></head><body style=3D"word-wrap: break-word; =
-webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Was =
the intended subject the *pcb* file =
format?<br><div><div><br></div><div>On Jan 3, 2016, at 12:53 AM, Britton =
Kerin (<a =
href=3D"mailto:britton DOT kerin AT gmail DOT com">britton DOT kerin AT gmail DOT com</a>) =
[via <a href=3D"mailto:geda-user AT delorie DOT com">geda-user AT delorie DOT com</a>] =
&lt;<a href=3D"mailto:geda-user AT delorie DOT com">geda-user AT delorie DOT com</a>&gt;=
 wrote:</div><br class=3D"Apple-interchange-newline"><blockquote =
type=3D"cite"><div dir=3D"ltr"><br><div class=3D"gmail_extra"><br><div =
class=3D"gmail_quote">On Sat, Jan 2, 2016 at 8:19 PM, John Doty <span =
dir=3D"ltr">&lt;<a href=3D"mailto:jpd AT noqsi DOT com" =
target=3D"_blank">jpd AT noqsi DOT com</a>&gt;</span> wrote:<br><blockquote =
class=3D"gmail_quote" style=3D"margin:0px 0px 0px =
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left=
-style:solid;padding-left:1ex"><div =
style=3D"word-wrap:break-word"><br><div><span class=3D""><div>On Jan 2, =
2016, at 9:27 PM, Britton Kerin (<a =
href=3D"mailto:britton DOT kerin AT gmail DOT com" =
target=3D"_blank">britton DOT kerin AT gmail DOT com</a>) [via <a =
href=3D"mailto:geda-user AT delorie DOT com" =
target=3D"_blank">geda-user AT delorie DOT com</a>] &lt;<a =
href=3D"mailto:geda-user AT delorie DOT com" =
target=3D"_blank">geda-user AT delorie DOT com</a>&gt; =
wrote:</div><br><blockquote type=3D"cite"><div dir=3D"ltr"><br><div =
class=3D"gmail_extra"><br><div class=3D"gmail_quote">On Sat, Jan 2, 2016 =
at 6:07 PM, John Doty <span dir=3D"ltr">&lt;<a =
href=3D"mailto:jpd AT noqsi DOT com" =
target=3D"_blank">jpd AT noqsi DOT com</a>&gt;</span> wrote:<br><blockquote =
class=3D"gmail_quote" style=3D"margin:0px 0px 0px =
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left=
-style:solid;padding-left:1ex"><div =
style=3D"word-wrap:break-word"><br><div><div>On Jan 2, 2016, at 7:47 PM, =
Britton Kerin (<a href=3D"mailto:britton DOT kerin AT gmail DOT com" =
target=3D"_blank">britton DOT kerin AT gmail DOT com</a>) [via <a =
href=3D"mailto:geda-user AT delorie DOT com" =
target=3D"_blank">geda-user AT delorie DOT com</a>] &lt;<a =
href=3D"mailto:geda-user AT delorie DOT com" =
target=3D"_blank">geda-user AT delorie DOT com</a>&gt; =
wrote:</div><br><blockquote type=3D"cite"><div dir=3D"ltr"><br><div =
class=3D"gmail_extra"><br><div class=3D"gmail_quote">On Sat, Jan 2, 2016 =
at 4:38 PM, John Doty <span dir=3D"ltr">&lt;<a =
href=3D"mailto:jpd AT noqsi DOT com" =
target=3D"_blank">jpd AT noqsi DOT com</a>&gt;</span> wrote:<br><blockquote =
class=3D"gmail_quote" style=3D"margin: 0px 0px 0px 0.8ex; =
border-left-width: 1px; border-left-color: rgb(204, 204, 204); =
border-left-style: solid; padding-left: 1ex; position: static; z-index: =
auto;"><div style=3D"word-wrap:break-word"><br><div><span><div>On Jan 2, =
2016, at 6:07 PM, Britton Kerin (<a =
href=3D"mailto:britton DOT kerin AT gmail DOT com" =
target=3D"_blank">britton DOT kerin AT gmail DOT com</a>) [via <a =
href=3D"mailto:geda-user AT delorie DOT com" =
target=3D"_blank">geda-user AT delorie DOT com</a>] &lt;<a =
href=3D"mailto:geda-user AT delorie DOT com" =
target=3D"_blank">geda-user AT delorie DOT com</a>&gt; =
wrote:</div><br><blockquote type=3D"cite"><div =
style=3D"font-family:Helvetica;font-size:12px;font-style:normal;font-varia=
nt:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text=
-align:start;text-indent:0px;text-transform:none;white-space:normal;word-s=
pacing:0px">Personally I find formats like this:<br><br>&nbsp; =
device=3DRESISTOR<br>&nbsp; T 44400 49300 5 10 1 1 90 0 =
1<br></div></blockquote></span></div></div></blockquote></div></div></div>=
</blockquote></div></div></blockquote></div></div></div></blockquote></spa=
n></div></div></blockquote></div></div></div></blockquote><div><br></div>S=
o, the subject the .sch file format?</div><div><br><blockquote =
type=3D"cite"><div dir=3D"ltr"><div class=3D"gmail_extra"><div =
class=3D"gmail_quote"><blockquote class=3D"gmail_quote" =
style=3D"margin:0px 0px 0px =
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left=
-style:solid;padding-left:1ex"><div =
style=3D"word-wrap:break-word"><div><span class=3D""><blockquote =
type=3D"cite"><div dir=3D"ltr"><div class=3D"gmail_extra"><div =
class=3D"gmail_quote"><blockquote class=3D"gmail_quote" =
style=3D"margin:0px 0px 0px =
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left=
-style:solid;padding-left:1ex"><div =
style=3D"word-wrap:break-word"><div><blockquote type=3D"cite"><div =
dir=3D"ltr"><div class=3D"gmail_extra"><div =
class=3D"gmail_quote"><blockquote class=3D"gmail_quote" style=3D"margin: =
0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, =
204, 204); border-left-style: solid; padding-left: 1ex; position: =
static; z-index: auto;"><div =
style=3D"word-wrap:break-word"><div><span><blockquote type=3D"cite"><div =
style=3D"font-family:Helvetica;font-size:12px;font-style:normal;font-varia=
nt:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text=
-align:start;text-indent:0px;text-transform:none;white-space:normal;word-s=
pacing:0px"><br>substantially less readable than ones with field names, =
but they are indeed easy to =
parse.</div></blockquote><div><br></div></span>Personally, I rarely edit =
these things manually except for the text fields, which are not =
difficult to find. The fact that they=92re easy to parse is handy for =
automation.</div><div><br><blockquote type=3D"cite"><span><div =
style=3D"font-family:Helvetica;font-size:12px;font-style:normal;font-varia=
nt:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text=
-align:start;text-indent:0px;text-transform:none;white-space:normal;word-s=
pacing:0px">&nbsp; The pcb format is quite a bit more elaborate and the =
savings from not rolling your own parser are more =
significant.<br></div><div =
style=3D"font-family:Helvetica;font-size:12px;font-style:normal;font-varia=
nt:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text=
-align:start;text-indent:0px;text-transform:none;white-space:normal;word-s=
pacing:0px"><br></div></span><span><div =
style=3D"font-family:Helvetica;font-size:12px;font-style:normal;font-varia=
nt:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text=
-align:start;text-indent:0px;text-transform:none;white-space:normal;word-s=
pacing:0px">I think you're criteria for what should go in libgeda are =
spot-on btw.&nbsp; Nor do I have any problem with a C interface calling =
python or gschem or for that matter C++.&nbsp; I do think providing a =
clean C interface to libgeda gets by far the best return on investment, =
since it's so widely known and with a little care wrappers can then be =
provided almost automatically for a wide variety of languages (via SWIG =
or some other similar mechanism -- or maybe Xorn facilitates this, I'm a =
little unclear).</div></span></blockquote><br></div><div>I don=92t find =
deconstructing C data structures particularly easier than parsing the =
format above. Just another layer I have to penetrate to get to the data. =
I do significant processing with simple things like sed, which don=92t =
handle binary data.</div><div><br></div><div>Wrappers CAN be provided, =
but will they? FFI programming is not the easiest thing. I hear =
&nbsp;complaints about the need for developers to maintain code. It =
seems to me that one way to address these concerns is to avoid and =
eliminate unnecessary =
code.</div></div></blockquote><div><br></div><div>Good question.&nbsp; =
It's a great result if you get it but a lot more work than using a =
serialization library, which is why the latter approach seems to me like =
a useful step in the right =
direction.</div></div></div></div></blockquote>Serialization library? =
Why do you want a extra, unnecessary, opaque interface? What, exactly, =
are you trying to =
accomplish?</div></div></blockquote><div><br></div><div>Two =
things:&nbsp;</div><div><div><br></div><div>&nbsp; &nbsp; 1.&nbsp; A =
human- and partial-parser-script-readable =
format</div></div></div></div></div></blockquote><div><br></div></span>We =
have that, I think. But you left out the most important virtue: =
*simple*.</div></div></blockquote><div><br></div><div style=3D"">I agree =
that it's readable enough, though it could be better.&nbsp; I also agree =
that simplicity is good.</div><div>&nbsp;</div><blockquote =
class=3D"gmail_quote" style=3D"margin:0px 0px 0px =
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left=
-style:solid;padding-left:1ex"><div style=3D"word-wrap:break-word"><span =
class=3D""><blockquote type=3D"cite"><div dir=3D"ltr"><div =
class=3D"gmail_extra"><div class=3D"gmail_quote">&nbsp; &nbsp; 2.&nbsp; =
Full parsers for as many languages as possible without writing them by =
hand</div></div></div></blockquote><div><br></div></span>So instead, you =
need to write an interface between a complicated parser and every =
application by hand. Where=92s the =
gain?</div></blockquote><div><br></div><div style=3D"">Here's what YAML =
looks like from perl:</div><div style=3D""><br></div><div =
style=3D"">&nbsp; &nbsp; &nbsp;use =
YAML::XS;<br></div><div><br></div><div>&nbsp; &nbsp; &nbsp;my $yaml =3D =
Dump [ 1..4 ];</div><div>&nbsp; &nbsp; &nbsp;my $array =3D Load =
$yaml;</div></div></div></div></blockquote><div><br></div>But you left =
out the next step: you have to deconstruct whatever it built to do =
anything with it. To do that, you have to understand the construction. =
While if I simply read the file (the format is too trivial for a reader =
to deserve the name =93parser"), I go directly to *my* application=92s =
model of the underlying data, on a trajectory that matches what I need. =
Your example seems to build some sort of complex data structure. What if =
a line by line data-driven approach is more =
natural?</div><div><br></div><div><blockquote type=3D"cite"><div =
dir=3D"ltr"><div class=3D"gmail_extra"><div =
class=3D"gmail_quote"><div><br></div><div style=3D"">The gain is that =
this is a vastly easier way to vivify a saved object that to write my =
own parser, </div></div></div></div></blockquote><br>I disagree. If the =
format were more complicated, you=92d have a point, but it=92s =
not.</div><div><br><blockquote type=3D"cite"><div dir=3D"ltr"><div =
class=3D"gmail_extra"><div class=3D"gmail_quote"><div style=3D"">or even =
my own partial parser for non-trivial cases.</div><blockquote =
class=3D"gmail_quote" style=3D"margin:0px 0px 0px =
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left=
-style:solid;padding-left:1ex"><div =
style=3D"word-wrap:break-word"><div><span class=3D""><blockquote =
type=3D"cite"><div dir=3D"ltr"><div class=3D"gmail_extra"><div =
class=3D"gmail_quote"><div>Now take a look at the design goals for =
YAML:</div><div><br></div><div>&nbsp; &nbsp; <a =
href=3D"http://www.yaml.org/spec/1.2/spec.html#id2708649" =
target=3D"_blank">http://www.yaml.org/spec/1.2/spec.html#id2708649</a></di=
v><div><br></div><div>It's a good fit.&nbsp; If it was only a matter of =
the technical merits I would say as close to perfect as it gets with =
software.</div></div></div></div></blockquote><div><br></div></span>Compar=
e it to&nbsp;<a =
href=3D"http://wiki.geda-project.org/geda:file_format_spec" =
target=3D"_blank">http://wiki.geda-project.org/geda:file_format_spec</a></=
div><div>YAML is enormously more complex to no advantage for =
us.</div></div></blockquote><div><br></div><div style=3D"">The point is =
that you don't have to deal with any of that =
complexity</div></div></div></div></blockquote><div><br></div>It becomes =
a dependency for *every* tool. It will break (they always do). It won=92t =
work with *every* language on every OS.</div><div><br><blockquote =
type=3D"cite"><div dir=3D"ltr"><div class=3D"gmail_extra"><div =
class=3D"gmail_quote"><div style=3D""> (of which there really isn't all =
that much -- calling it enormously complex is a big =
overstatement).&nbsp; It's a library with approximately two entry points =
per language for modern languages, and not much more for =
C.&nbsp;</div></div></div></div></blockquote><div><br></div>AWK?</div><div=
><br><blockquote type=3D"cite"><div dir=3D"ltr"><div =
class=3D"gmail_extra"><div class=3D"gmail_quote"><div style=3D""> =
Parsing may be a non-issue for you if you only care about strings in =
.sch files, but for many useful operations on pcbs you need the whole =
thing, or most of =
it.</div></div></div></div></blockquote><div><br></div>Pcb may need it, =
but that=92s a completely different issue. We=92re talking about .sch =
files.</div><div><br></div><div>Can we *please* separate the projects so =
that we don=92t keep going through this kind of =
thing?</div><div><br><blockquote type=3D"cite"><div dir=3D"ltr"><div =
class=3D"gmail_extra"><div class=3D"gmail_quote"><blockquote =
class=3D"gmail_quote" style=3D"margin:0px 0px 0px =
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left=
-style:solid;padding-left:1ex"><div style=3D"word-wrap:break-word"><span =
class=3D""><blockquote type=3D"cite"><div dir=3D"ltr"><div =
class=3D"gmail_extra"><div class=3D"gmail_quote">Unfortunately there's =
the usual good-versus-most-popular trade-off in deciding between YAML =
and JSON.&nbsp; I still favor YAML in this case, largely because I can't =
look at people like you and honestly claim that JSON is in all respects =
fun to read/edit/sed over etc., and because my personal experience with =
JSON is that although the parsers are truly ubiquitous they have some =
annoying characteristics &nbsp;(at least the Perl one =
does).</div></div></div></blockquote></span>But since it doesn=92t =
relieve the need of the application programmer to understand the =
interface, it is merely adding more code for no gain (or =
even</div></blockquote><div><br></div><div style=3D"">I'm not sure what =
you mean by this.&nbsp; The programmer needs to understand what the =
fields mean, sure.&nbsp; YAML/JSON helps somewhat with this, because the =
fields have names.&nbsp; Even if you do understand the existing format, =
that understanding that will absolutely not get you a live editable =
version of what's in a pcb file without a lot of (pointless) additional =
work.</div></div></div></div></blockquote><div><br></div>We=92re not =
talking about pcb. What you showed above is =
.sch.</div><div><br><blockquote type=3D"cite"><div dir=3D"ltr"><div =
class=3D"gmail_extra"><div =
class=3D"gmail_quote"><div><br></div><blockquote class=3D"gmail_quote" =
style=3D"margin:0px 0px 0px =
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left=
-style:solid;padding-left:1ex"><div style=3D"word-wrap:break-word"> =
negative gain, given the added complexity). And neither YAML nor JSON is =
as universally readable and processable as the format we =
have.</div></blockquote><div><br></div><div style=3D"">There's no added =
complexity to speak of for clients, and YAML is far more readable and at =
least as processable as what we have now.&nbsp; I think your view of =
things is strongly tied to your particular use =
case.&nbsp;</div></div></div></div></blockquote><div><br></div>Cases. =
Many cases.</div><div><br><blockquote type=3D"cite"><div dir=3D"ltr"><div =
class=3D"gmail_extra"><div class=3D"gmail_quote"><div style=3D""> It =
sounds like you mostly work on attributes with their own special meaning =
(IIRC noqsi has attributes with their own =
syntax),</div></div></div></div></blockquote><div><br></div>Perhaps you =
mean gnet-spice-noqsi? All of the simulation flows have special =
attributes beyond what pcb uses. Some PCB layout flows do too (do you =
know why some library symbols have pins=3D and =
class=3D?).</div><div><br><blockquote type=3D"cite"><div dir=3D"ltr"><div =
class=3D"gmail_extra"><div class=3D"gmail_quote"><div style=3D""> and =
don't have to parse everything.&nbsp; That's fine.&nbsp; I sure don't =
want to break anything for =
you.</div></div></div></div></blockquote><div><br></div>It sounds like =
you do. It sounds like you=92re volunteering to rewrite everybody=92s =
custom scripts for things like symbol generation and refdes =
renumbering.</div><div><br><blockquote type=3D"cite"><div dir=3D"ltr"><div=
 class=3D"gmail_extra"><div class=3D"gmail_quote"><div =
style=3D""><br></div><div style=3D"">However, if you consider the actual =
problem I'm hoping to address you might sympathize at least with the =
thought that not reinventing the parser everywhere might be =
worthwhile.&nbsp;</div></div></div></div></blockquote><div><br></div>=93Re=
invent=94 and =93parser=94 imply a difficulty that we don=92t have. Is =
it really that hard to compose a few strings =
like:</div><div><br></div><div>"T %d %d %d %d %d %d %d %d =
%d=94</div><div><br></div><div>Or just pick out a field you need with =
something like (data-driven AWK):</div><div><br></div><div>/^T =
/{</div><div><span class=3D"Apple-tab-span" style=3D"white-space:pre">	=
</span>numlines =3D $10</div><div><span class=3D"Apple-tab-span" =
style=3D"white-space:pre">	</span>/* do something with the text =
lines */</div><div>}</div><div><br><blockquote type=3D"cite"><div =
dir=3D"ltr"><div class=3D"gmail_extra"><div class=3D"gmail_quote"><div =
style=3D""> I started out to write a quick parser in =
perl,</div></div></div></div></blockquote><div><br></div>Why did you do =
that? Why didn=92t you just start out to write tool in Perl, with =
reading happening as the tool needs it? If you approach the problem =
assuming it=92s hard, you make it hard.</div><div><br><blockquote =
type=3D"cite"><div dir=3D"ltr"><div class=3D"gmail_extra"><div =
class=3D"gmail_quote"><div style=3D""> in exactly the way you seem to be =
proposing should be the way to do =
everything.</div></div></div></div></blockquote><div><br></div>You =
didn=92t understand.</div><div><br><blockquote type=3D"cite"><div =
dir=3D"ltr"><div class=3D"gmail_extra"><div class=3D"gmail_quote"><div =
style=3D"">&nbsp; It's a significant hassle and you end up with a slow =
parser that only works from one language.&nbsp; As you've pointed out =
yourself, parsing (and serialization) is a relatively trivial, =
thoroughly solved problem.&nbsp; Why reinvent the =
solution?</div></div></div></div></blockquote><div><br></div>1. The =
project has a significant investment in the other approach (tragesym, =
refdes_renum).&nbsp;</div><div><br></div><div>2. Users have an enormous =
investment: there are lots of custom scripts that various people have =
mooted here over the years, and I=92d guess there are many more private =
ones.</div><div><br></div><div>3. You=92re using heavy words: =93parser=94=
, =93reinvent=94, etc. for a lightweight =
job.</div><div><br></div><div>4. The job of decoding the output of a =
universal parser isn=92t much, if at all, simpler than just reading the =
file if the file encoding is simple.</div><div><br></div><div>5. =
Serialization is appropriate when you start from well-defined common =
binary data structures. We don=92t have that. We have a well-defined =
common text format.</div><div><br></div><div><blockquote =
type=3D"cite"><div dir=3D"ltr"><div class=3D"gmail_extra"><div =
class=3D"gmail_quote"><div style=3D"">I've taken some time over this =
because at least one other person indicated that they shared your =
concern about using a generic parser rather than an arbitrary custom =
format.&nbsp; So I'd like to actually convince you, lest you convince =
others that doing as I propose is a bad idea for =
pcb.</div></div></div></div></blockquote><div><br></div>It might not be =
a bad idea for pcb.</div><div><br><blockquote type=3D"cite"><div =
dir=3D"ltr"><div class=3D"gmail_extra"><div class=3D"gmail_quote"><div =
style=3D"">&nbsp; I'd also like to apologize for bad attitude and =
rudeness I've shown you in the past, and hope you're able to view this =
issue in technical terms alone (I confess that I sometimes have =
difficulty doing this with your =
emails).</div></div></div></div></blockquote><div><br></div>No apology =
needed. I=92m a scientist, I=92m used to this. I recall the time I was =
publicly accused of =93witchcraft=94 for digging out a result using an =
unfamiliar statistical method. The accuser then went back to her lab, =
took another look at her own data, and proved I was right. We never saw =
a dispute as personal. Scientific research is like this: we have titanic =
arguments with our friends. That=92s how we bring all of the facts and =
ideas to light and get the science right.</div><div><br><blockquote =
type=3D"cite"><div dir=3D"ltr"><div class=3D"gmail_extra"><div =
class=3D"gmail_quote"><div style=3D""><br></div><div =
style=3D"">Britton</div><div><br></div></div></div></div>
</blockquote></div><br><div apple-content-edited=3D"true">
<span class=3D"Apple-style-span" style=3D"border-collapse: separate; =
border-spacing: 0px;"><p style=3D"margin: 0.0px 0.0px 0.0px 0.0px"><font =
face=3D"Helvetica" size=3D"3" style=3D"font: 12.0px Helvetica">John =
Doty<span class=3D"Apple-converted-space">&nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;<span class=3D"Apple-converted-space">&nbsp;</span><span =
class=3D"Apple-converted-tab">&nbsp; &nbsp;<span =
class=3D"Apple-converted-space">&nbsp;</span></span></span>Noqsi =
Aerospace, Ltd.</font></p><p style=3D"margin: 0.0px 0.0px 0.0px =
0.0px"><a href=3D"http://www.noqsi.com/">http://www.noqsi.com/</a></p><p =
style=3D"margin: 0.0px 0.0px 0.0px 0.0px"><font face=3D"Helvetica" =
size=3D"3" style=3D"font: 12.0px Helvetica"><a =
href=3D"mailto:jpd AT noqsi DOT com">jpd AT noqsi DOT com</a></font></p><br =
class=3D"Apple-interchange-newline"></span>
</div>
<br></body></html>=

--Apple-Mail=_4FFE1E41-CC5C-4240-B661-6AB5A5FFE42F--

--Apple-Mail=_485C7A71-49FC-4AEF-8D2E-E7A4B212F0DE--
