Chapter 12. Programming

Table of Contents

12.1. The shell script
12.1.1. POSIX shell compatibility
12.1.2. Shell parameters
12.1.3. Shell conditionals
12.1.4. Shell loops
12.1.5. The shell command-line processing sequence
12.1.6. Utility programs for shell script
12.1.7. Shell script dialog
12.1.8. Shell script example with zenity
12.2. Make
12.3. C
12.3.1. Simple C program (gcc)
12.4. Debug
12.4.1. Basic gdb execution
12.4.2. Debugging the Debian package
12.4.3. Obtaining backtrace
12.4.4. Advanced gdb commands
12.4.5. Debugging X Errors
12.4.6. Check dependency on libraries
12.4.7. Memory leak detection tools
12.4.8. Static code analysis tools
12.4.9. Disassemble binary
12.5. Flex — a better Lex
12.6. Bison — a better Yacc
12.7. Autoconf
12.7.1. Compile and install a program
12.7.2. Uninstall program
12.8. Perl short script madness
12.9. Web
12.10. The source code translation
12.11. Making Debian package

I provide some pointers for people who want to learn enough programming on the Debian system to trace the packaged source code. Here are notable packages and their corresponding documentation packages for programming.

Table 12.1. List of packages to help programming

package          popcon         size   documentation
autoconf         V:27, I:219    1898   "info autoconf" provided by autoconf-doc
automake         V:25, I:206    1699   "info automake" provided by automake1.10-doc
bash             V:845, I:999   5363   "info bash" provided by bash-doc
bison            V:15, I:127    2201   "info bison" provided by bison-doc
cpp              V:422, I:853   22     "info cpp" provided by cpp-doc
ddd              V:1, I:21      3628   "info ddd" provided by ddd-doc
exuberant-ctags  V:6, I:41      289    exuberant-ctags(1)
flex             V:14, I:117    1288   "info flex" provided by flex-doc
gawk             V:275, I:326   1852   "info gawk" provided by gawk-doc
gcc              V:186, I:688   7      "info gcc" provided by gcc-doc
gdb              V:34, I:166    6300   "info gdb" provided by gdb-doc
gettext          V:56, I:358    6408   "info gettext" provided by gettext-doc
gfortran         V:8, I:63      2      "info gfortran" provided by gfortran-doc (Fortran 95)
fpc              I:5            42     fpc(1) and html by fp-docs (Pascal)
glade            V:1, I:16      3152   help provided via menu (UI Builder)
libc6            V:918, I:998   10280  "info libc" provided by glibc-doc and glibc-doc-reference
make             V:184, I:655   1291   "info make" provided by make-doc
xutils-dev       V:3, I:32      1432   imake(1), xmkmf(1), etc.
mawk             V:554, I:997   198    mawk(1)
perl             V:758, I:994   17566  perl(1) and html pages provided by perl-doc and perl-doc-html
python           V:710, I:985   571    python(1) and html pages provided by python-doc
tcl8.4           V:17, I:202    167    tcl(3) and detail manual pages provided by tcl8.4-doc
tk8.4            V:11, I:135    168    tk(3) and detail manual pages provided by tk8.4-doc
ruby             V:62, I:268    47     ruby(1) and interactive reference provided by ri
vim              V:148, I:379   2063   help(F1) menu provided by vim-doc
susv2            I:0            39     fetch "The Single UNIX Specifications v2"
susv3            I:0            39     fetch "The Single UNIX Specifications v3"

Online references are available by typing "man name" after installing the manpages and manpages-dev packages. Online references for the GNU tools are available by typing "info program_name" after installing the pertinent documentation packages. You may need to include the contrib and non-free archives in addition to the main archive since some GFDL documentation is not considered to be DFSG compliant.

[Warning] Warning

Do not use "test" as the name of an executable test file. "test" is a shell builtin, so typing "test" runs the builtin instead of your file unless you give its path.
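The name clash can be observed directly. The following is a minimal sketch that uses only standard shell commands; the variable name "resolved" is chosen here for illustration.

```shell
# "command -v" shows how the shell resolves a name.
# For a builtin it prints the bare name rather than a file path.
resolved=$(command -v test)
echo "test resolves to: $resolved"

# With no arguments, the builtin evaluates an empty expression,
# which is false, so it silently returns a non-zero status.
test || echo "builtin test returned $?"
```

A file named "test" can still be run by giving an explicit path such as "./test".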

[Caution] Caution

You should install software programs directly compiled from source into "/usr/local" or "/opt" to avoid collision with system programs.

[Tip] Tip

Code examples for the song "99 Bottles of Beer" should give you a good idea of practically all programming languages.

A shell script is a text file with the execution bit set that contains commands in the following format.

#!/bin/sh
 ... command lines

The first line specifies the shell interpreter that reads and executes the contents of this file.
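As a minimal sketch of these two requirements (the first line and the execution bit), the following creates a small script and runs it. The file name "hello.sh" and the temporary directory are assumptions made for illustration.

```shell
# Work in a temporary directory so nothing existing is overwritten.
workdir=$(mktemp -d)

# Write a two-line script; the quoted 'EOF' stops this shell from
# expanding anything inside the here-document.
cat > "$workdir/hello.sh" <<'EOF'
#!/bin/sh
echo "Hello from a shell script"
EOF

chmod +x "$workdir/hello.sh"      # set the execution bit
output=$("$workdir/hello.sh")     # /bin/sh is started via the first line
echo "$output"
rm -rf "$workdir"
```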

Reading shell scripts is the best way to understand how a Unix-like system works. Here, I give some pointers and reminders for shell programming. See "Shell Mistakes" (http://www.greenend.org.uk/rjk/2001/04/shell.html) to learn from mistakes.

Unlike shell interactive mode (see Section 1.5, “The simple shell command” and Section 1.6, “Unix-like text processing”), shell scripts frequently use parameters, conditionals, and loops.

Each command returns an exit status which can be used for conditional expressions.

  • Success: 0 ("True")

  • Error: non 0 ("False")
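The exit status of the last command is available in "$?". The following sketch shows both cases with the standard "true" and "false" commands; the variable name "root_check" is only illustrative.

```shell
# $? holds the exit status of the most recent command.
true
echo "true returned $?"                 # 0 means success

# "||" runs its right side only when the left side fails.
false || echo "false returned $?"       # non-zero means failure

# "if" tests the exit status of the command that follows it.
if [ -d / ]; then
  root_check="/ is a directory"
fi
echo "$root_check"
```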

[Note] Note

"0" in the shell conditional context means "True", while "0" in the C conditional context means "False".

[Note] Note

"[" is the equivalent of the test command, which evaluates its arguments up to "]" as a conditional expression.

Basic conditional idioms to remember are the following.

  • "<command> && <if_success_run_this_command_too> || true"

  • "<command> || <if_not_success_run_this_command_too> || true"

  • A multi-line script snippet such as the following

if [ <conditional_expression> ]; then
 <if_success_run_this_command>
else
 <if_not_success_run_this_command>
fi

Here the trailing "|| true" is needed to ensure that the shell script does not exit at this line accidentally when the shell is invoked with the "-e" flag.
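The effect of the "-e" flag can be checked with a child shell. This is a sketch only; the variable names "without_guard" and "with_guard" are made up for the example.

```shell
# Each test runs in a child shell started with -e.
# Without the guard, "false" aborts the child before echo runs.
without_guard=$(sh -e -c 'false; echo reached' || true)

# With the guard, the list "false || true" succeeds and echo runs.
with_guard=$(sh -e -c 'false || true; echo reached')

echo "without guard: [$without_guard]"
echo "with guard:    [$with_guard]"
```

The first result is empty because the child shell exited at the failing command.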



Arithmetic integer comparison operators in the conditional expression are "-eq", "-ne", "-lt", "-le", "-gt", and "-ge".
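A short sketch of these operators in use follows. The function name "compare_int" is hypothetical, not a standard command.

```shell
# compare_int is a hypothetical helper, not a standard command.
compare_int() {
  if [ "$1" -eq "$2" ]; then
    echo "equal"
  elif [ "$1" -lt "$2" ]; then
    echo "less"
  else
    echo "greater"
  fi
}

compare_int 9 10     # compared as integers, so 9 is less than 10
compare_int 10 10
compare_int 11 10
```

These operators compare values as integers, which matters for pairs such as 9 and 10 that sort differently as strings.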

The shell processes a script roughly in the following sequence.

  • The shell reads a line.

  • The shell groups a part of the line as one token if it is within "…" or '…'.

  • The shell splits the other parts of a line into tokens at the following.

    • Whitespaces: <space> <tab> <newline>

    • Metacharacters: < > | ; & ( )

  • The shell checks each token against the reserved words to adjust its behavior if not within "…" or '…'.

    • reserved word: if then elif else fi for in while until do done case esac

  • The shell expands alias if not within "…" or '…'.

  • The shell expands tilde if not within "…" or '…'.

    • "~" → current user's home directory

    • "~<user>" → <user>'s home directory

  • The shell expands parameter to its value if not within '…'.

    • parameter: "$PARAMETER" or "${PARAMETER}"

  • The shell expands command substitution if not within '…'.

    • "$( command )" → the output of "command"

    • "` command `" → the output of "command"

  • The shell expands pathname glob to matching file names if not within "…" or '…'.

    • * → any characters

    • ? → one character

    • […] → any one of the characters in "…"

  • The shell looks up the command from the following and executes it.

    • function definition

    • builtin command

    • executable file in "$PATH"

  • The shell goes to the next line and repeats this process again from the top of this sequence.

Single quotes within double quotes have no effect.
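The quoting rules in the sequence above can be sketched as follows. All variable names are chosen for illustration.

```shell
name="world"

double="Hello, $name"          # parameter expanded inside "…"
single='Hello, $name'          # nothing expanded inside '…'
nested="it's '$name'"          # single quotes are plain characters inside "…"
subst="sum: $(expr 1 + 2)"     # command substitution expanded inside "…"
star="*"                       # glob not expanded inside "…"

printf '%s\n' "$double" "$single" "$nested" "$subst" "$star"
```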

Executing "set -x" in the shell or invoking the shell with the "-x" option makes the shell print all of the commands it executes. This is quite handy for debugging.

Here is a simple script which creates an ISO image with RS02 data supplemented by dvdisaster(1).
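As a small sketch, the trace can be captured from a child shell. The exact quoting of the traced commands differs between shells, so no exact output is shown; the variable name "trace" is illustrative.

```shell
# Run a child shell with -x and capture both stdout and stderr.
# The trace lines go to stderr, each prefixed with "+ " by default.
trace=$(sh -x -c 'x=1; echo "value $x"' 2>&1)
printf '%s\n' "$trace"
```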

#!/bin/sh -e
# gmkrs02 : Copyright (C) 2007 Osamu Aoki <osamu@debian.org>, Public Domain
#set -x
error_exit()
{
  echo "$1" >&2
  exit 1
}
# Initialize variables
DATA_ISO="$HOME/Desktop/iso-$$.img"
LABEL=$(date +%Y%m%d-%H%M%S-%Z)
if [ $# != 0 ] && [ -d "$1" ]; then
  DATA_SRC="$1"
else
  # Select directory for creating ISO image from folder on desktop
  DATA_SRC=$(zenity --file-selection --directory  \
    --title="Select the directory tree root to create ISO image") \
    || error_exit "Exit on directory selection"
fi
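# Check size of archive
# (GNU du counts 1024-byte blocks by default, so dividing by 1024 gives MB)
# Quote $DATA_SRC so that directory names with spaces work
xterm -T "Check size $DATA_SRC" -e du -s "$DATA_SRC"/*
SIZE=$(($(du -s "$DATA_SRC" | awk '{print $1}')/1024))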
if [ $SIZE -le 520 ] ; then
  zenity --info --title="Dvdisaster RS02" --width 640  --height 400 \
    --text="The data size is good for CD backup:\\n $SIZE MB"
elif [ $SIZE -le 3500 ]; then
  zenity --info --title="Dvdisaster RS02" --width 640  --height 400 \
    --text="The data size is good for DVD backup :\\n $SIZE MB"
else
  zenity --info --title="Dvdisaster RS02" --width 640  --height 400 \
    --text="The data size is too big to backup : $SIZE MB"
  error_exit "The data size is too big to backup :\\n $SIZE MB"
fi
# only xterm is sure to have working -e option
# Create raw ISO image
rm -f "$DATA_ISO" || true
xterm -T "genisoimage $DATA_ISO" \
  -e genisoimage -r -J -V "$LABEL" -o "$DATA_ISO" "$DATA_SRC"
# Create RS02 supplemental redundancy
xterm -T "dvdisaster $DATA_ISO" -e  dvdisaster -i "$DATA_ISO" -mRS02 -c
zenity --info --title="Dvdisaster RS02" --width 640  --height 400 \
  --text="ISO/RS02 data ($SIZE MB) \\n created at: $DATA_ISO"
# EOF

You may wish to create a launcher on the desktop with its command set to something like "/usr/local/bin/gmkrs02 %d".

Make is a utility to maintain groups of programs. Upon execution of make(1), make reads the rule file, "Makefile", and updates a target if it depends on prerequisite files that have been modified since the target was last modified, or if the target does not exist. The execution of these updates may occur concurrently.

The rule file syntax is the following.

target: [ prerequisites ... ]
 [TAB]  command1
 [TAB]  -command2 # ignore errors
 [TAB]  @command3 # suppress echoing

Here "[TAB]" is a TAB character. Each command line is interpreted by the shell after make variable substitution. Use "\" at the end of a line to continue the script onto the next line. Use "$$" to pass a literal "$" to the shell, for example to reference an environment variable in a command.
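The following sketch writes a one-rule "Makefile" from the shell and runs it, to show the TAB and "$$" conventions together. The file names "name.txt" and "greeting.txt" are hypothetical, and "$@" is make's automatic variable for the target name. The sketch skips the build if make is not installed.

```shell
workdir=$(mktemp -d)
echo "Debian" > "$workdir/name.txt"

# printf turns \t into the required TAB; the single quotes keep
# "$$" and "$@" intact so that make, not this shell, sees them.
printf 'greeting.txt: name.txt\n\t@echo "Hello, $$(cat name.txt)" > $@\n' \
  > "$workdir/Makefile"

if command -v make >/dev/null 2>&1; then
  (cd "$workdir" && make greeting.txt)
  result=$(cat "$workdir/greeting.txt")
else
  result="make is not installed"
fi
echo "$result"
rm -rf "$workdir"
```

Make reduces "$$(cat name.txt)" to "$(cat name.txt)" before handing the line to the shell, which then performs the command substitution.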

Implicit rules for the target and prerequisites can be written, for example, by the following.

%.o: %.c header.h

Here, the target contains the character "%" (exactly one of them). The "%" can match any nonempty substring in the actual target filenames. The prerequisites likewise use "%" to show how their names relate to the actual target name.



Run "make -p -f/dev/null" to see automatic internal rules.

You can set up a proper environment for compiling programs written in the C programming language by the following.

# apt-get install glibc-doc manpages-dev libc6-dev gcc build-essential

The libc6-dev package, i.e., the GNU C Library, provides the C standard library, which is a collection of header files and library routines used by the C programming language.

See the following references for C.

  • "info libc" (C library function reference)

  • gcc(1) and "info gcc"

  • each_C_library_function_name(3)

  • Kernighan & Ritchie, "The C Programming Language", 2nd edition (Prentice Hall)

Debugging is an important part of programming activities. Knowing how to debug programs makes you a good Debian user who can produce meaningful bug reports.

The primary debugger on Debian is gdb(1), which enables you to inspect a program while it executes.

Let's install gdb and related programs by the following.

# apt-get install gdb gdb-doc build-essential devscripts

A good tutorial for gdb is provided by "info gdb" or can be found elsewhere on the web. Here is a simple example of using gdb(1) on a "program" compiled with the "-g" option to produce debugging information.

$ gdb program
(gdb) b 1                # set break point at line 1
(gdb) run args           # run program with args
(gdb) next               # next line
...
(gdb) step               # step forward
...
(gdb) p parm             # print parm
...
(gdb) p parm=12          # set value to 12
...
(gdb) quit

[Tip] Tip

Many gdb(1) commands can be abbreviated. Tab expansion works as in the shell.

Flex is a Lex-compatible fast lexical analyzer generator.

A tutorial for flex(1) can be found in "info flex".

You need to provide your own "main()" and "yywrap()". Otherwise, your flex program should look like the following to compile without a library. This is because "yywrap" is a macro and "%option main" turns on "%option noyywrap" implicitly.

%option main
%%
.|\n    ECHO ;
%%

Alternatively, you may compile with the "-lfl" linker option at the end of your cc(1) command line (like AT&T-Lex with "-ll"). No "%option" is needed in this case.

Several packages in Debian provide a Yacc-compatible lookahead LR parser or LALR parser generator.

A tutorial for bison(1) can be found in "info bison".

You need to provide your own "main()" and "yyerror()". "main()" calls "yyparse()" which calls "yylex()", usually created with Flex.

%%

%%

Autoconf is a tool for producing shell scripts that automatically configure software source code packages to adapt to many kinds of Unix-like systems using the entire GNU build system.

autoconf(1) produces the configuration script "configure". "configure" automatically creates a customized "Makefile" using the "Makefile.in" template.

Although any AWK scripts can be automatically rewritten in Perl using a2p(1), one-liner AWK scripts are best converted to one-liner Perl scripts manually.

Consider the following AWK script snippet.

awk '($2=="1957") { print $3 }' |

This is equivalent to any one of the following lines.

perl -ne '@f=split; if ($f[1] eq "1957") { print "$f[2]\n"}' |
perl -ne 'if ((@f=split)[1] eq "1957") { print "$f[2]\n"}' |
perl -ne '@f=split; print $f[2] if ( $f[1]==1957 )' |
perl -lane 'print $F[2] if $F[1] eq "1957"' |
perl -lane 'print$F[2]if$F[1]eq+1957' |

The last one is a riddle. It takes advantage of the following Perl features.

  • Whitespace is optional.

  • Numbers are automatically converted to strings.

See perlrun(1) for the command-line options. For crazier Perl scripts, Perl Golf may be interesting.

Basic interactive dynamic web pages can be made as follows.

  • Queries are presented to the browser user using HTML forms.

  • Filling and clicking on the form entries sends one of the following URL strings with encoded parameters from the browser to the web server.

    • "http://www.foo.dom/cgi-bin/program.pl?VAR1=VAL1&VAR2=VAL2&VAR3=VAL3"

    • "http://www.foo.dom/cgi-bin/program.py?VAR1=VAL1&VAR2=VAL2&VAR3=VAL3"

    • "http://www.foo.dom/program.php?VAR1=VAL1&VAR2=VAL2&VAR3=VAL3"

  • "%nn" in URL is replaced with a character with hexadecimal nn value.

  • The environment variable is set as: "QUERY_STRING="VAR1=VAL1&VAR2=VAL2&VAR3=VAL3"".

  • The CGI program (any one of "program.*") on the web server runs with the environment variable "$QUERY_STRING" set.

  • The stdout of the CGI program is sent to the web browser and is presented as an interactive dynamic web page.
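The last steps can be sketched in the shell as follows. The function name "cgi_response" is hypothetical, and the sketch assumes the web server passes the query portion of the URL unchanged in "QUERY_STRING". It does no parsing of the parameters.

```shell
# cgi_response is a hypothetical name for the body of a CGI script.
cgi_response() {
  # A CGI response starts with headers, then an empty line, then the body.
  printf 'Content-Type: text/plain\r\n\r\n'
  printf 'Raw query string: %s\n' "${QUERY_STRING:-}"
}

# Simulate what the web server does: set the variable, then run the program.
response=$(QUERY_STRING='VAR1=VAL1&VAR2=VAL2&VAR3=VAL3' cgi_response)
printf '%s\n' "$response"
```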

For security reasons it is better not to hand-craft new hacks for parsing CGI parameters. There are established modules for this in Perl and Python. PHP comes with this functionality built in. When client data storage is needed, HTTP cookies are used. When client-side data processing is needed, JavaScript is frequently used.

For more, see the Common Gateway Interface, The Apache Software Foundation, and JavaScript.

Searching for "CGI tutorial" on Google by typing the encoded URL http://www.google.com/search?hl=en&ie=UTF-8&q=CGI+tutorial directly into the browser address bar is a good way to see a CGI script in action on the Google server.

There are programs to convert source code.


If you want to make a Debian package, read the following.

There are packages such as dh-make, dh-make-perl, etc., which help packaging.