Chapter 12. Programming

Table of Contents

12.1. The shell script
12.1.1. POSIX shell compatibility
12.1.2. Shell parameters
12.1.3. Shell conditionals
12.1.4. Shell loops
12.1.5. The shell command-line processing sequence
12.1.6. Utility programs for shell script
12.1.7. Shell script dialog
12.1.8. Shell script example with zenity
12.2. Make
12.3. C
12.3.1. Simple C program (gcc)
12.4. Debug
12.4.1. Basic gdb execution
12.4.2. Debugging the Debian package
12.4.3. Obtaining backtrace
12.4.4. Advanced gdb commands
12.4.5. Debugging X Errors
12.4.6. Check dependency on libraries
12.4.7. Memory leak detection tools
12.4.8. Static code analysis tools
12.4.9. Disassemble binary
12.5. Flex — a better Lex
12.6. Bison — a better Yacc
12.7. Autoconf
12.7.1. Compile and install a program
12.7.2. Uninstall program
12.8. Perl short script madness
12.9. Web
12.10. The source code translation
12.11. Making Debian package

I provide some pointers for people learning to program on the Debian system well enough to trace the packaged source code. Here are notable packages and corresponding documentation packages for programming.

Table 12.1. List of packages to help programming

package          popcon        size   documentation
autoconf         V:31, I:242   1868   "info autoconf" provided by autoconf-doc
automake         V:30, I:237   1710   "info automake" provided by automake1.10-doc
bash             V:846, I:999  5798   "info bash" provided by bash-doc
bison            V:10, I:110   2061   "info bison" provided by bison-doc
cpp              V:393, I:802  42     "info cpp" provided by cpp-doc
ddd              V:0, I:13     3965   "info ddd" provided by ddd-doc
exuberant-ctags  V:7, I:41     333    exuberant-ctags(1)
flex             V:9, I:98     1174   "info flex" provided by flex-doc
gawk             V:384, I:494  2199   "info gawk" provided by gawk-doc
gcc              V:154, I:599  45     "info gcc" provided by gcc-doc
gdb              V:19, I:133   7983   "info gdb" provided by gdb-doc
gettext          V:67, I:360   6496   "info gettext" provided by gettext-doc
gfortran         V:7, I:64     16     "info gfortran" provided by gfortran-doc (Fortran 95)
fpc              I:4           115    fpc(1) and html by fp-docs (Pascal)
glade            V:0, I:10     2214   help provided via menu (UI Builder)
libc6            V:933, I:999  10677  "info libc" provided by glibc-doc and glibc-doc-reference
make             V:152, I:607  1211   "info make" provided by make-doc
xutils-dev       V:1, I:16     1466   imake(1), xmkmf(1), etc.
mawk             V:365, I:997  183    mawk(1)
perl             V:556, I:995  568    perl(1) and html pages provided by perl-doc and perl-doc-html
python           V:627, I:988  648    python(1) and html pages provided by python-doc
tcl              V:32, I:434   21     tcl(3) and detail manual pages provided by tcl-doc
tk               V:33, I:422   21     tk(3) and detail manual pages provided by tk-doc
ruby             V:125, I:333  38     ruby(1) and interactive reference provided by ri
vim              V:122, I:392  2470   help(F1) menu provided by vim-doc
susv2            I:0           15     fetch "The Single UNIX Specifications v2"
susv3            I:0           15     fetch "The Single UNIX Specifications v3"

Online references are available by typing "man name" after installing the manpages and manpages-dev packages. Online references for the GNU tools are available by typing "info program_name" after installing the pertinent documentation packages. You may need to include the contrib and non-free archives in addition to the main archive since some GFDL documentation is not considered to be DFSG compliant.

[Warning] Warning

Do not use "test" as the name of an executable test file. "test" is a shell builtin.
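As a quick check of the warning above, you can ask the shell how it resolves the name. This is only a small sketch; the exact wording of the message may differ between shells.

```shell
# Ask the shell how it resolves the name "test".
# Because a builtin is found first, an executable named "test" in the
# current directory is reached only via an explicit path such as "./test".
type test
```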

[Caution] Caution

You should install software programs directly compiled from source into "/usr/local" or "/opt" to avoid collision with system programs.

[Tip] Tip

Code examples for creating the "99 Bottles of Beer" song should give you a good idea of practically all programming languages.

A shell script is a text file with the execution bit set that contains commands in the following format.

#!/bin/sh
 ... command lines

The first line specifies the shell interpreter which reads and executes the contents of this file.

Reading shell scripts is the best way to understand how a Unix-like system works. Here, I give some pointers and reminders for shell programming. See "Shell Mistakes" (http://www.greenend.org.uk/rjk/2001/04/shell.html) to learn from mistakes.

Unlike shell interactive mode (see Section 1.5, “The simple shell command” and Section 1.6, “Unix-like text processing”), shell scripts frequently use parameters, conditionals, and loops.

Each command returns an exit status which can be used for conditional expressions.

  • Success: 0 ("True")

  • Error: non 0 ("False")
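The following minimal sketch shows how the exit status can be captured and inspected. The helper name "status_of" is made up for this illustration and is not a standard command.

```shell
# status_of COMMAND [ARGS...]: run a command and describe its exit status.
status_of() {
  rc=0
  "$@" >/dev/null 2>&1 || rc=$?   # keep the exit status of the command
  if [ "$rc" -eq 0 ]; then
    echo "true (exit status 0)"
  else
    echo "false (exit status $rc)"
  fi
}

status_of true              # a command that always succeeds
status_of sh -c 'exit 3'    # a command that fails with status 3
```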

[Note] Note

"0" in the shell conditional context means "True", while "0" in the C conditional context means "False".

[Note] Note

"[" is the equivalent of the test command, which evaluates its arguments up to "]" as a conditional expression.

Basic conditional idioms to remember are the following.

  • "<command> && <if_success_run_this_command_too> || true"

  • "<command> || <if_not_success_run_this_command_too> || true"

  • A multi-line script snippet such as the following

if [ <conditional_expression> ]; then
 <if_success_run_this_command>
else
 <if_not_success_run_this_command>
fi

Here the trailing "|| true" is needed to ensure that the shell script does not exit accidentally at this line when the shell is invoked with the "-e" flag.
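The effect of the trailing "|| true" under the "-e" flag can be sketched as follows. The function names are hypothetical, and "false" merely stands in for any command that fails.

```shell
# Without the guard, the failing list makes "sh -e" exit before "echo" runs.
without_guard() {
  sh -e -c 'false || false; echo "reached"'
}

# With the trailing "|| true", the list succeeds and the script continues.
with_guard() {
  sh -e -c 'false || false || true; echo "reached"'
}

without_guard || echo "script aborted with status $?"
with_guard
```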



Arithmetic integer comparison operators in the conditional expression are "-eq", "-ne", "-lt", "-le", "-gt", and "-ge".
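As a small sketch of these operators in use, the hypothetical helper "compare_int" below reports how two integers relate to each other.

```shell
# compare_int A B: describe how integer A relates to integer B.
compare_int() {
  if [ "$1" -lt "$2" ]; then
    echo "$1 is less than $2"
  elif [ "$1" -eq "$2" ]; then
    echo "$1 is equal to $2"
  else
    echo "$1 is greater than $2"
  fi
}

compare_int 520 3500
```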

The shell processes a script roughly as the following sequence.

  • The shell reads a line.

  • The shell groups a part of the line as one token if it is within "…" or '…'.

  • The shell splits the other parts of the line into tokens by the following.

    • Whitespace: <space> <tab> <newline>

    • Metacharacters: < > | ; & ( )

  • The shell checks each token against the reserved words to adjust its behavior if not within "…" or '…'.

    • reserved words: if then elif else fi for in while until do done case esac

  • The shell expands alias if not within "…" or '…'.

  • The shell expands tilde if not within "…" or '…'.

    • "~" → current user's home directory

    • "~<user>" → <user>'s home directory

  • The shell expands parameter to its value if not within '…'.

    • parameter: "$PARAMETER" or "${PARAMETER}"

  • The shell expands command substitution if not within '…'.

    • "$( command )" → the output of "command"

    • "` command `" → the output of "command"

  • The shell expands pathname glob to matching file names if not within "…" or '…'.

    • * → any characters

    • ? → one character

    • […] → any one of the characters in "…"

  • The shell looks up the command from the following and executes it.

    • function definition

    • builtin command

    • executable file in "$PATH"

  • The shell goes to the next line and repeats this process again from the top of this sequence.

Single quotes within double quotes have no effect.
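The quoting rules in the sequence above can be tried out with a short sketch such as the following. The function name "show_quoting" and the value "Debian" are arbitrary choices for this illustration.

```shell
show_quoting() {
  name="Debian"
  echo "double: $name"             # parameter is expanded within "…"
  echo 'single: $name'             # nothing is expanded within '…'
  echo "nested: '$name'"           # ' within "…" is an ordinary character
  echo "substitution: $(echo ok)"  # command substitution works within "…"
}

show_quoting
```

This prints "double: Debian", "single: $name", "nested: 'Debian'", and "substitution: ok", one per line.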

Executing "set -x" in the shell or invoking the shell with the "-x" option makes the shell print all commands as they are executed. This is quite handy for debugging.
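A minimal sketch of tracing is shown below. The trace is written to standard error with a "+ " prefix by default, and the exact quoting shown in the trace lines varies between shells. The function name "trace_demo" is hypothetical.

```shell
trace_demo() {
  # Run a one-line script under "sh -x" and merge the trace (stderr)
  # into standard output so that both can be seen together.
  sh -x -c 'greeting=hello; echo "$greeting world"' 2>&1
}

trace_demo
```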

Here is a simple script which creates an ISO image with RS02 data supplemented by dvdisaster(1).

#!/bin/sh -e
# gmkrs02 : Copyright (C) 2007 Osamu Aoki <osamu@debian.org>, Public Domain
#set -x
error_exit()
{
  echo "$1" >&2
  exit 1
}
# Initialize variables
DATA_ISO="$HOME/Desktop/iso-$$.img"
LABEL=$(date +%Y%m%d-%H%M%S-%Z)
if [ $# != 0 ] && [ -d "$1" ]; then
  DATA_SRC="$1"
else
  # Select directory for creating ISO image from folder on desktop
  DATA_SRC=$(zenity --file-selection --directory  \
    --title="Select the directory tree root to create ISO image") \
    || error_exit "Exit on directory selection"
fi
# Check size of archive
xterm -T "Check size $DATA_SRC" -e du -s "$DATA_SRC"/*
SIZE=$(($(du -s "$DATA_SRC" | awk '{print $1}')/1024))
if [ $SIZE -le 520 ] ; then
  zenity --info --title="Dvdisaster RS02" --width 640  --height 400 \
    --text="The data size is good for CD backup:\\n $SIZE MB"
elif [ $SIZE -le 3500 ]; then
  zenity --info --title="Dvdisaster RS02" --width 640  --height 400 \
    --text="The data size is good for DVD backup :\\n $SIZE MB"
else
  zenity --info --title="Dvdisaster RS02" --width 640  --height 400 \
    --text="The data size is too big to backup : $SIZE MB"
  error_exit "The data size is too big to backup :\\n $SIZE MB"
fi
# only xterm is sure to have working -e option
# Create raw ISO image
rm -f "$DATA_ISO" || true
xterm -T "genisoimage $DATA_ISO" \
  -e genisoimage -r -J -V "$LABEL" -o "$DATA_ISO" "$DATA_SRC"
# Create RS02 supplemental redundancy
xterm -T "dvdisaster $DATA_ISO" -e  dvdisaster -i "$DATA_ISO" -mRS02 -c
zenity --info --title="Dvdisaster RS02" --width 640  --height 400 \
  --text="ISO/RS02 data ($SIZE MB) \\n created at: $DATA_ISO"
# EOF

You may wish to create a launcher on the desktop with its command set to something like "/usr/local/bin/gmkrs02 %d".

Make is a utility to maintain groups of programs. Upon execution of make(1), make reads the rule file, "Makefile", and updates a target if it depends on prerequisite files that have been modified since the target was last modified, or if the target does not exist. The execution of these updates may occur concurrently.

The rule file syntax is the following.

target: [ prerequisites ... ]
 [TAB]  command1
 [TAB]  -command2 # ignore errors
 [TAB]  @command3 # suppress echoing

Here "[TAB]" is a TAB character. Each command line is interpreted by the shell after make variable substitution. Use "\" at the end of a line to continue the command on the next line. Use "$$" to pass a literal "$" to the shell, e.g. to refer to shell or environment variables.
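The rule syntax can be tried out with a throwaway rule file, as in the following sketch. The file names "name.txt" and "hello.txt" and the function name "demo_make" are invented for this illustration, and the make package must be installed.

```shell
# Build a small Makefile in a temporary directory and run make(1) on it.
demo_make() {
  dir=$(mktemp -d) || return 1
  echo "world" > "$dir/name.txt"
  # printf turns "\t" into the TAB which must start each command line.
  # "$$(cat name.txt)" reaches the shell as "$(cat name.txt)".
  printf 'hello.txt: name.txt\n\t@echo "Hello, $$(cat name.txt)" > hello.txt\n' \
    > "$dir/Makefile"
  (cd "$dir" && make -s hello.txt && cat hello.txt)
  rm -rf "$dir"
}

# Run the demonstration only if make is available.
if command -v make >/dev/null 2>&1; then
  demo_make
fi
```

The target "hello.txt" is built because it does not exist yet. Running make again in the same directory without touching "name.txt" would not rebuild it.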

Implicit rules for the target and prerequisites can be written, for example, by the following.

%.o: %.c header.h

Here, the target contains the character "%" (exactly one of them). The "%" can match any nonempty substring in the actual target filenames. The prerequisites likewise use "%" to show how their names relate to the actual target name.



Run "make -p -f/dev/null" to see automatic internal rules.

You can set up a proper environment to compile programs written in the C programming language by the following.

# apt-get install glibc-doc manpages-dev libc6-dev gcc build-essential

The libc6-dev package, i.e., GNU C Library, provides the C standard library, which is a collection of header files and library routines used by the C programming language.

See the following references for C.

  • "info libc" (C library function reference)

  • gcc(1) and "info gcc"

  • each_C_library_function_name(3)

  • Kernighan & Ritchie, "The C Programming Language", 2nd edition (Prentice Hall)

Debugging is an important part of programming activities. Knowing how to debug programs makes you a good Debian user who can produce meaningful bug reports.

The primary debugger on Debian is gdb(1), which enables you to inspect a program while it executes.

Let's install gdb and related programs by the following.

# apt-get install gdb gdb-doc build-essential devscripts

A good tutorial for gdb is provided by "info gdb" or can be found elsewhere on the web. Here is a simple example of using gdb(1) on a "program" compiled with the "-g" option to produce debugging information.

$ gdb program
(gdb) b 1                # set break point at line 1
(gdb) run args           # run program with args
(gdb) next               # next line
...
(gdb) step               # step forward
...
(gdb) p parm             # print parm
...
(gdb) p parm=12          # set value to 12
...
(gdb) quit

[Tip] Tip

Many gdb(1) commands can be abbreviated. Tab expansion works as in the shell.

Since all installed binaries should be stripped on the Debian system by default, most debugging symbols are removed in the normal package. In order to debug Debian packages with gdb(1), either corresponding *-dbg packages or *-dbgsym packages need to be installed (e.g. libc6-dbg in the case of libc6, coreutils-dbgsym in the case of coreutils).

Old-style packages provide a corresponding *-dbg package, which is placed directly in the Debian main archive alongside the original package. Newer packages may generate *-dbgsym packages automatically when built; those debug packages are placed separately in the debian-debug archive. Please refer to the articles on the Debian Wiki for more information.

If a package to be debugged provides neither a *-dbg package nor a *-dbgsym package, you need to rebuild it and install it by the following.

$ mkdir /path/new ; cd /path/new
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get install fakeroot devscripts build-essential
$ sudo apt-get build-dep source_package_name
$ apt-get source package_name
$ cd package_name*

Fix bugs if needed.

Bump the package version by the following to one which does not collide with official Debian versions, e.g. one appended with "+debug1" when recompiling an existing package version, or one appended with "~pre1" when compiling an unreleased package version.

$ dch -i

Compile and install packages with debug symbols by the following.

$ export DEB_BUILD_OPTIONS=nostrip,noopt
$ debuild
$ cd ..
$ sudo debi package_name*.changes

You need to check the build scripts of the package and ensure that "CFLAGS=-g -Wall" is used for compiling binaries.

Flex is a Lex-compatible fast lexical analyzer generator.

A tutorial for flex(1) can be found in "info flex".

You need to provide your own "main()" and "yywrap()". Otherwise, your flex program should look like this to compile without a library. This is because "yywrap" is a macro and "%option main" turns on "%option noyywrap" implicitly.

%option main
%%
.|\n    ECHO ;
%%

Alternatively, you may compile with the "-lfl" linker option at the end of your cc(1) command line (like AT&T-Lex with "-ll"). No "%option" is needed in this case.

Several packages provide a Yacc-compatible lookahead LR parser or LALR parser generator in Debian.


A tutorial for bison(1) can be found in "info bison".

You need to provide your own "main()" and "yyerror()". "main()" calls "yyparse()" which calls "yylex()", usually created with Flex.

%%

%%

Autoconf is a tool for producing shell scripts that automatically configure software source code packages to adapt to many kinds of Unix-like systems using the entire GNU build system.

autoconf(1) produces the configuration script "configure". "configure" automatically creates a customized "Makefile" using the "Makefile.in" template.

Although any AWK scripts can be automatically rewritten in Perl using a2p(1), one-liner AWK scripts are best converted to one-liner Perl scripts manually.

Let's consider the following AWK script snippet.

awk '($2=="1957") { print $3 }' |

This is equivalent to any one of the following lines.

perl -ne '@f=split; if ($f[1] eq "1957") { print "$f[2]\n"}' |
perl -ne 'if ((@f=split)[1] eq "1957") { print "$f[2]\n"}' |
perl -lne '@f=split; print $f[2] if ( $f[1]==1957 )' |
perl -lane 'print $F[2] if $F[1] eq "1957"' |
perl -lane 'print$F[2]if$F[1]eq+1957' |

The last one is a riddle. It takes advantage of the following Perl features.

  • The whitespace is optional.

  • Numbers are automatically converted to strings.

See perlrun(1) for the command-line options. For more crazy Perl scripts, Perl Golf may be interesting.
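To try the snippets above, you can feed them a few whitespace-separated records. The sample records and the function names below are invented for this sketch only.

```shell
# Hypothetical sample records: name, year, keyword.
sample_data() {
  printf '%s\n' 'alice 1957 Sputnik' 'bob 1969 Apollo' 'carol 1957 Fortran'
}

# The AWK filter and one of its Perl equivalents from above.
filter_awk()  { awk '($2=="1957") { print $3 }'; }
filter_perl() { perl -lane 'print $F[2] if $F[1] eq "1957"'; }

sample_data | filter_awk
```

Both filters should print the third field of the records whose second field is 1957, here "Sputnik" and "Fortran".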

Basic interactive dynamic web pages can be made as follows.

  • Queries are presented to the browser user using HTML forms.

  • Filling and clicking on the form entries sends one of the following URL strings with encoded parameters from the browser to the web server.

    • "http://www.foo.dom/cgi-bin/program.pl?VAR1=VAL1&VAR2=VAL2&VAR3=VAL3"

    • "http://www.foo.dom/cgi-bin/program.py?VAR1=VAL1&VAR2=VAL2&VAR3=VAL3"

    • "http://www.foo.dom/program.php?VAR1=VAL1&VAR2=VAL2&VAR3=VAL3"

  • "%nn" in the URL stands for the character with the hexadecimal value nn.

  • The web server sets the environment variable as: "QUERY_STRING="VAR1=VAL1&VAR2=VAL2&VAR3=VAL3"".

  • The web server executes the CGI program (any one of "program.*") with the environment variable "$QUERY_STRING".

  • stdout of CGI program is sent to the web browser and is presented as an interactive dynamic web page.
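The flow above can be illustrated with a minimal CGI sketch written as a shell script. It only echoes the raw query string as plain text and makes no attempt to parse it; the function name "cgi_response" is hypothetical, and the web server configuration needed to run it is assumed.

```shell
#!/bin/sh
# Minimal CGI sketch: reply in plain text with the raw query string.
cgi_response() {
  # A CGI reply starts with header lines, followed by an empty line
  # and then the body. Plain text is used so that the echoed input is
  # never interpreted as HTML by the browser.
  printf 'Content-Type: text/plain\r\n\r\n'
  printf 'QUERY_STRING=%s\n' "${QUERY_STRING:-}"
}

cgi_response
```

Setting the variable by hand, e.g. QUERY_STRING='VAR1=VAL1&VAR2=VAL2', and running the script from the command line gives a rough impression of what the web server does.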

For security reasons, it is better not to hand-craft new hacks for parsing CGI parameters. There are established modules for this in Perl and Python. PHP comes with this functionality built in. When client data storage is needed, HTTP cookies are used. When client side data processing is needed, JavaScript is frequently used.

For more, see the Common Gateway Interface, The Apache Software Foundation, and JavaScript.

Searching "CGI tutorial" on Google by typing the encoded URL http://www.google.com/search?hl=en&ie=UTF-8&q=CGI+tutorial directly into the browser address bar is a good way to see a CGI script in action on the Google server.

There are programs to convert source code.


If you want to make a Debian package, read the following.

There are packages such as debmake, dh-make, dh-make-perl, etc., which help packaging.