
Re: DDP CVS commit by jseidel: ddp/manuals.sgml/release-notes/de release-note ...



On Sun, Jan 09, 2005 at 09:38:12PM +0100, Jens Seidel wrote:
> On Sun, Jan 09, 2005 at 08:04:58PM +0100, Geert Stappers wrote:
> > On Sun, Jan 09, 2005 at 06:24:17PM +0100, Jens Seidel wrote:
> > > On Sun, Jan 09, 2005 at 05:57:22PM +0100, Geert Stappers wrote:
> > > > On Sun, Jan 09, 2005 at 09:30:55AM -0700, DDP CVS wrote:
> >  <snip/>
> > > > > Modified files:
> > > > > 	manuals.sgml/release-notes/de: release-notes.de.sgml 
> > > > > 
> > > > > Log message:
> > > > > 	use latin1 encoding instead of HTML entities to simplify proofreading
> > > > >       and to increase compatibility with various tools
> > > > 
> > > > I think you broke compatibility with XML & SGML tools. [1]
> > > > [1] both are ASCII only.

I should have reacted with:

  Why latin1 and how does it fit in XML & SGML tools?
  I think you broke compatibility with those tools.

I was wrong about the ASCII-only argument.

> > > The old code using &uuml; and &szlig; in each third word makes the SGML
> > > source code nearly unreadable. Please note that the previous version
> > > already contained a mixture of latin1 and ASCII.
> [snip]
> > > I also found a missing umlaut in the output (last PDF page) because
> > > <url id="&url-debian-i18n;" name="&Uuml;bersetzungen">
> > > was used but which works great using name="Übersetzungen".
> > 
> > My concern is that the source is not pure 7-bit ASCII.
> 
> Please note that all versions of Release Notes (except English one)
> use 8bit characters. Do you really expect that a Japanese translator
> writes &entity1;&entity2;&entity3;...?
> 
> > It should be ASCII only for XML and SGML.
> 
> But debiandoc-sgml supports all common locales, especially latin1,
> latin2,...
> 
> > Jens, that you spend time on the release notes is good.
> > I do respect that.
> > 
> > But your arguments to break stuff are poor.
> 
> Maybe. I agree that pure ASCII has advantages but the current file can
> easily be converted to another locale using iconv, recode, konwert, ...
> 
> > * it is hard to proofread gr&ouml;&szlig;e dateien
> 
> It is!
> 
> > Consider the SGML source as computer program source
> > and proofreading is running the program.
> > Each "bug" you find, has to be modified in the source.
> > The edit-compile-test cycle can indeed be boring,
> > you shouldn't cheat by implementing "compiled blobs"
> > in the source code.
> 
> I would really like to know other people's opinions.
> 
> > As one volunteer to another volunteer:
> > 
> >   Please revert the latin1 changes
> 
> First I will wait for more feedback. If the majority agrees with you I
> will revert my changes (but what about name="&Uuml;bersetzungen" which
> doesn't work for url tags?).

Here is feedback from me, based on my previous posting.

| My concern is that the source is not pure 7-bit ASCII.
| It should be ASCII only for XML and SGML.

I was wrong, or at least incomplete: ISO 10646 is allowed.

| Jens, that you spend time on the release notes is good.
| I do respect that.
I stand by that :-)

| But your arguments to break stuff are poor.
I shall calm down.

| * The website has latin1
| That is because it is converted to latin1
| With latin1 "precompiled" codes, you can't convert to other encodings
See below

| * aspell can't handle HTML entities.
| Then use another tool or aspell on another file format
| 
| * the source had already latin1 codes.
| That has you set on the wrong track,
| but is no excuse to go further downhill
See below

| * it is hard to proofread gr&ouml;&szlig;e dateien
| Consider the SGML source as computer program source
| and proofreading is running the program.
| Each "bug" you find, has to be modified in the source.
| The edit-compile-test cycle can indeed be boring,
| you shouldn't cheat by implementing "compiled blobs"
| in the source code.
The blobs are allowed when they are UTF-8 or UTF-16 encoded.
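
For proofreading a source that still uses entities, a minimal sketch of
expanding them on a scratch copy could look like the following. It assumes
Python's standard html module is an acceptable stand-in for a real SGML
entity resolver; expand_entities is a hypothetical helper name, not part
of debiandoc-sgml.

```python
import html

def expand_entities(text):
    """Expand named character entities such as &ouml; into real characters.

    Assumption: html.unescape uses the HTML entity table, which covers the
    Latin-1 entities discussed here. It also expands &amp; and &lt;, so the
    result is meant for reading, not for writing back into the SGML source.
    """
    return html.unescape(text)

readable = expand_entities("gr&ouml;&szlig;e dateien")
title = expand_entities("&Uuml;bersetzungen")
```

The output is only a reading aid; the source file itself stays untouched.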

| As one volunteer to another volunteer:
| 
|   Please revert the latin1 changes

I couldn't find where latin1 and UTF-8 differ for our usage,
release-notes.de.sgml.  It could be that we are safe.


I did find that

latin1   ~~  rfc1345  ~~ iso 8859
UTF-8    ~~  rfc3629  ~~ iso 10646


But I did not find that &euml; is the same in both.
Since checking costs time now, let us _assume_ they are equal.[1]
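
A quick way to check that assumption is to compare the bytes directly.
The sketch below is only an illustration: the sample characters are
chosen here, not taken from release-notes.de.sgml. It suggests that the
code points agree (the first 256 code points of ISO 10646 match
ISO 8859-1) but that the stored bytes differ for everything outside
ASCII, so a latin1 file is not automatically a valid UTF-8 file.

```python
# Compare how a few characters are stored under latin1 (ISO 8859-1) and UTF-8.
samples = ["a", "\u00fc", "\u00df", "\u00eb"]  # a, u-umlaut, sharp s, e-diaeresis

def encodings(ch):
    """Return (code point, latin1 bytes, UTF-8 bytes) for one character."""
    return ord(ch), ch.encode("latin-1"), ch.encode("utf-8")

for ch in samples:
    cp, l1, u8 = encodings(ch)
    print(f"U+{cp:04X}  latin1={l1.hex()}  utf-8={u8.hex()}  same bytes: {l1 == u8}")

# Re-encoding means decoding with the old charset and encoding with the new
# one, which is roughly what a tool such as iconv or recode does to a file.
latin1_bytes = "\u00dcbersetzungen".encode("latin-1")
utf8_bytes = latin1_bytes.decode("latin-1").encode("utf-8")
```

Only the plain ASCII letter comes out byte-identical; each accented
character takes one byte in latin1 and two in UTF-8.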


Cheers
Geert Stappers

[1] please prove me wrong,
 but a confirmation is also welcome.
