Chapter 12. Programming

Table of Contents

12.1. The shell script
12.1.1. POSIX shell compatibility
12.1.2. Shell parameters
12.1.3. Shell conditionals
12.1.4. Shell loops
12.1.5. Shell environment variables
12.1.6. The shell command-line processing sequence
12.1.7. Utility programs for shell script
12.2. Scripting in interpreted languages
12.2.1. Debugging interpreted language codes
12.2.2. GUI program with the shell script
12.2.3. Custom actions for GUI filer
12.2.4. Perl short script madness
12.3. Coding in compiled languages
12.3.1. C
12.3.2. Simple C program (gcc)
12.3.3. Flex — a better Lex
12.3.4. Bison — a better Yacc
12.4. Static code analysis tools
12.5. Debug
12.5.1. Basic gdb execution
12.5.2. Debugging the Debian package
12.5.3. Obtaining backtrace
12.5.4. Advanced gdb commands
12.5.5. Check dependency on libraries
12.5.6. Dynamic call tracing tools
12.5.7. Debugging X Errors
12.5.8. Memory leak detection tools
12.5.9. Disassemble binary
12.6. Build tools
12.6.1. Make
12.6.2. Autotools
12.6.2.1. Compile and install a program
12.6.2.2. Uninstall program
12.6.3. Meson
12.7. Web
12.8. The source code translation
12.9. Making Debian package

I provide some pointers for people to learn programming on the Debian system enough to trace the packaged source code. Here are notable packages and corresponding documentation packages for programming.

Online references are available by typing "man name" after installing manpages and manpages-dev packages. Online references for the GNU tools are available by typing "info program_name" after installing the pertinent documentation packages. You may need to include the contrib and non-free archives in addition to the main archive since some GFDL documentations are not considered to be DFSG compliant.

Please consider to use version control system tools. See Section 10.5, “Git”.

[Warning] Warning

Do not use "test" as the name of an executable test file. "test" is a shell builtin.

[Caution] Caution

You should install software programs directly compiled from source into "/usr/local" or "/opt" to avoid collision with system programs.

[Tip] Tip

Code examples of creating "Song 99 Bottles of Beer" should give you good ideas of practically all the programming languages.

The shell script is a text file with the execution bit set and contains the commands in the following format.

#!/bin/sh
 ... command lines

The first line specifies the shell interpreter which read and execute this file contents.

Reading shell scripts is the best way to understand how a Unix-like system works. Here, I give some pointers and reminders for shell programming. See "Shell Mistakes" (https://www.greenend.org.uk/rjk/2001/04/shell.html) to learn from mistakes.

Unlike shell interactive mode (see Section 1.5, “The simple shell command” and Section 1.6, “Unix-like text processing”), shell scripts frequently use parameters, conditionals, and loops.

Each command returns an exit status which can be used for conditional expressions.

  • Success: 0 ("True")

  • Error: non 0 ("False")

[Note] Note

"0" in the shell conditional context means "True", while "0" in the C conditional context means "False".

[Note] Note

"[" is the equivalent of the test command, which evaluates its arguments up to "]" as a conditional expression.

Basic conditional idioms to remember are the following.

  • "command && if_success_run_this_command_too || true"

  • "command || if_not_success_run_this_command_too || true"

  • A multi-line script snippet as the following

if [ conditional_expression ]; then
 if_success_run_this_command
else
 if_not_success_run_this_command
fi

Here trailing "|| true" was needed to ensure this shell script does not exit at this line accidentally when shell is invoked with "-e" flag.



Arithmetic integer comparison operators in the conditional expression are "-eq", "-ne", "-lt", "-le", "-gt", and "-ge".

The shell processes a script roughly as the following sequence.

  • The shell reads a line.

  • The shell groups a part of the line as one token if it is within "…" or '…'.

  • The shell splits other part of a line into tokens by the following.

    • Whitespaces: space tab newline

    • Metacharacters: < > | ; & ( )

  • The shell checks the reserved word for each token to adjust its behavior if not within "…" or '…'.

    • reserved word: if then elif else fi for in while unless do done case esac

  • The shell expands alias if not within "…" or '…'.

  • The shell expands tilde if not within "…" or '…'.

    • "~" → current user's home directory

    • "~user" → user's home directory

  • The shell expands parameter to its value if not within '…'.

    • parameter: "$PARAMETER" or "${PARAMETER}"

  • The shell expands command substitution if not within '…'.

    • "$( command )" → the output of "command"

    • "` command `" → the output of "command"

  • The shell expands pathname glob to matching file names if not within "…" or '…'.

    • * → any characters

    • ? → one character

    • […] → any one of the characters in ""

  • The shell looks up command from the following and execute it.

    • function definition

    • builtin command

    • executable file in "$PATH"

  • The shell goes to the next line and repeats this process again from the top of this sequence.

Single quotes within double quotes have no effect.

Executing "set -x" in the shell or invoking the shell with "-x" option make the shell to print all of commands executed. This is quite handy for debugging.


When you wish to automate a task on Debian, you should script it with an interpreted language first. The guide line for the choice of the interpreted language is:

  • Use dash, if the task is a simple one which combines CLI programs with a shell program.

  • Use python3, if the task isn't a simple one and you are writing it from scratch.

  • Use perl, tcl, ruby, ... if there is an existing code using one of these languages on Debian which needs to be touched up to do the task.

If the resulting code is too slow, you can rewrite only the critical portion for the execution speed in a compiled language and call it from the interpreted language.

The shell script can be improved to create an attractive GUI program. The trick is to use one of so-called dialog programs instead of dull interaction using echo and read commands.


Here is an example of GUI program to demonstrate how easy it is just with a shell script.

This script uses zenity to select a file (default /etc/motd) and display it.

GUI launcher for this script can be created following Section 9.4.10, “Starting a program from GUI”.

#!/bin/sh -e
# Copyright (C) 2021 Osamu Aoki <osamu@debian.org>, Public Domain
# vim:set sw=2 sts=2 et:
DATA_FILE=$(zenity --file-selection --filename="/etc/motd" --title="Select a file to check") || \
  ( echo "E: File selection error" >&2 ; exit 1 )
# Check size of archive
if ( file -ib "$DATA_FILE" | grep -qe '^text/' ) ; then
  zenity --info --title="Check file: $DATA_FILE" --width 640  --height 400 \
    --text="$(head -n 20 "$DATA_FILE")"
else
  zenity --info --title="Check file: $DATA_FILE" --width 640  --height 400 \
    --text="The data is MIME=$(file -ib "$DATA_FILE")"
fi

This kind of approach to GUI program with the shell script is useful only for simple choice cases. If you are to write any program with complexities, please consider writing it on more capable platform.


Here, Section 12.3.3, “Flex — a better Lex” and Section 12.3.4, “Bison — a better Yacc” are included to indicate how compiler-like program can be written in C language by compiling higher level description into C language.

You can set up proper environment to compile programs written in the C programming language by the following.

# apt-get install glibc-doc manpages-dev libc6-dev gcc build-essential

The libc6-dev package, i.e., GNU C Library, provides C standard library which is collection of header files and library routines used by the C programming language.

See references for C as the following.

  • "info libc" (C library function reference)

  • gcc(1) and "info gcc"

  • each_C_library_function_name(3)

  • Kernighan & Ritchie, "The C Programming Language", 2nd edition (Prentice Hall)

Flex is a Lex-compatible fast lexical analyzer generator.

Tutorial for flex(1) can be found in "info flex".

Many simple examples can be found under "/usr/share/doc/flex/examples/". [7]

Several packages provide a Yacc-compatible lookahead LR parser or LALR parser generator in Debian.


Tutorial for bison(1) can be found in "info bison".

You need to provide your own "main()" and "yyerror()". "main()" calls "yyparse()" which calls "yylex()", usually created with Flex.

Here is an example to create a simple terminal calculator program.

Let's create example.y:

/* calculator source for bison */
%{
#include <stdio.h>
extern int yylex(void);
extern int yyerror(char *);
%}

/* declare tokens */
%token NUMBER
%token OP_ADD OP_SUB OP_MUL OP_RGT OP_LFT OP_EQU

%%
calc:
 | calc exp OP_EQU    { printf("Y: RESULT = %d\n", $2); }
 ;

exp: factor
 | exp OP_ADD factor  { $$ = $1 + $3; }
 | exp OP_SUB factor  { $$ = $1 - $3; }
 ;

factor: term
 | factor OP_MUL term { $$ = $1 * $3; }
 ;

term: NUMBER
 | OP_LFT exp OP_RGT  { $$ = $2; }
  ;
%%

int main(int argc, char **argv)
{
  yyparse();
}

int yyerror(char *s)
{
  fprintf(stderr, "error: '%s'\n", s);
}

Let's create, example.l:

/* calculator source for flex */
%{
#include "example.tab.h"
%}

%%
[0-9]+ { printf("L: NUMBER = %s\n", yytext); yylval = atoi(yytext); return NUMBER; }
"+"    { printf("L: OP_ADD\n"); return OP_ADD; }
"-"    { printf("L: OP_SUB\n"); return OP_SUB; }
"*"    { printf("L: OP_MUL\n"); return OP_MUL; }
"("    { printf("L: OP_LFT\n"); return OP_LFT; }
")"    { printf("L: OP_RGT\n"); return OP_RGT; }
"="    { printf("L: OP_EQU\n"); return OP_EQU; }
"exit" { printf("L: exit\n");   return YYEOF; } /* YYEOF = 0 */
.      { /* ignore all other */ }
%%

Then execute as follows from the shell prompt to try this:

$ bison -d example.y
$ flex example.l
$ gcc -lfl example.tab.c lex.yy.c -o example
$ ./example
$ ./example
1 + 2 * ( 3 + 1 ) =
L: NUMBER = 1
L: OP_ADD
L: NUMBER = 2
L: OP_MUL
L: OP_LFT
L: NUMBER = 3
L: OP_ADD
L: NUMBER = 1
L: OP_RGT
L: OP_EQU
Y: RESULT = 9

exit
L: exit

Lint like tools can help automatic static code analysis.

Indent like tools can help human code reviews by reformatting source codes consistently.

Ctags like tools can help human code reviews by generating an index (or tag) file of names found in source codes.

[Tip] Tip

Configuring your favorite editor (emacs or vim) to use asynchronous lint engine plugins helps your code writing. These plugins are getting very powerful by taking advantage of Language Server Protocol. Since they are moving fast, using their upstream code instead of Debian package may be a good option.


Debug is important part of programming activities. Knowing how to debug programs makes you a good Debian user who can produce meaningful bug reports.


Primary debugger on Debian is gdb(1) which enables you to inspect a program while it executes.

Let's install gdb and related programs by the following.

# apt-get install gdb gdb-doc build-essential devscripts

Good tutorial of gdb can be found:

  • info gdb

  • “Debugging with GDB” in /usr/share/doc/gdb-doc/html/gdb/index.html

  • tutorial on the web

Here is a simple example of using gdb(1) on a "program" compiled with the "-g" option to produce debugging information.

$ gdb program
(gdb) b 1                # set break point at line 1
(gdb) run args           # run program with args
(gdb) next               # next line
...
(gdb) step               # step forward
...
(gdb) p parm             # print parm
...
(gdb) p parm=12          # set value to 12
...
(gdb) quit
[Tip] Tip

Many gdb(1) commands can be abbreviated. Tab expansion works as in the shell.

Since all installed binaries should be stripped on the Debian system by default, most debugging symbols are removed in the normal package. In order to debug Debian packages with gdb(1), *-dbgsym packages need to be installed (e.g. coreutils-dbgsym in the case of coreutils). The source packages generate *-dbgsym packages automatically along with normal binary packages and those debug packages are placed separately in debian-debug archive. Please refer to articles on Debian Wiki for more information.

If a package to be debugged does not provide its *-dbgsym package, you need to install it after rebuilding it by the following.

$ mkdir /path/new ; cd /path/new
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get install fakeroot devscripts build-essential
$ apt-get source package_name
$ cd package_name*
$ sudo apt-get build-dep ./

Fix bugs if needed.

Bump package version to one which does not collide with official Debian versions, e.g. one appended with "+debug1" when recompiling existing package version, or one appended with "~pre1" when compiling unreleased package version by the following.

$ dch -i

Compile and install packages with debug symbols by the following.

$ export DEB_BUILD_OPTIONS="nostrip noopt"
$ debuild
$ cd ..
$ sudo debi package_name*.changes

You need to check build scripts of the package and ensure to use "CFLAGS=-g -Wall" for compiling binaries.

When you encounter program crash, reporting bug report with cut-and-pasted backtrace information is a good idea.

The backtrace can be obtained by gdb(1) using one of the following approaches:

For infinite loop or frozen keyboard situation, you can force to crash the program by pressing Ctrl-\ or Ctrl-C or executing “kill -ABRT PID”. (See Section 9.4.12, “Killing a process”)

[Tip] Tip

Often, you see a backtrace where one or more of the top lines are in "malloc()" or "g_malloc()". When this happens, chances are your backtrace isn't very useful. The easiest way to find some useful information is to set the environment variable "$MALLOC_CHECK_" to a value of 2 (malloc(3)). You can do this while running gdb by doing the following.

 $ MALLOC_CHECK_=2 gdb hello

Make is a utility to maintain groups of programs. Upon execution of make(1), make read the rule file, "Makefile", and updates a target if it depends on prerequisite files that have been modified since the target was last modified, or if the target does not exist. The execution of these updates may occur concurrently.

The rule file syntax is the following.

target: [ prerequisites ... ]
 [TAB]  command1
 [TAB]  -command2 # ignore errors
 [TAB]  @command3 # suppress echoing

Here "[TAB]" is a TAB code. Each line is interpreted by the shell after make variable substitution. Use "\" at the end of a line to continue the script. Use "$$" to enter "$" for environment values for a shell script.

Implicit rules for the target and prerequisites can be written, for example, by the following.

%.o: %.c header.h

Here, the target contains the character "%" (exactly one of them). The "%" can match any nonempty substring in the actual target filenames. The prerequisites likewise use "%" to show how their names relate to the actual target name.



Run "make -p -f/dev/null" to see automatic internal rules.

Autotools is a suite of programming tools designed to assist in making source code packages portable to many Unix-like systems.

  • Autoconf is a tool to produce a shell script "configure" from "configure.ac".

    • "configure" is used later to produce "Makefile" from "Makefile.in" template.

  • Automake is a tool to produce "Makefile.in" from "Makefile.am".

  • Libtool is a shell script to address the software portability problem when compiling shared libraries from source code.

The software build system has been evolving:

  • Autotools on the top of Make has been the de facto standard for the portable build infrastructure since 1990s. This is extremely slow.

  • CMake initially released in 2000 improved speed significantly but was originally built on the top of inherently slow Make. (Now Ninja can be its backend.)

  • Ninja initially released in 2012 is meant to replace Make for the further improved build speed and is designed to have its input files generated by a higher-level build system.

  • Meson initially released in 2013 is the new popular and fast higher-level build system which uses Ninja as its backend.

See documents found at "The Meson Build system" and "The Ninja build system".

Basic interactive dynamic web pages can be made as follows.

  • Queries are presented to the browser user using HTML forms.

  • Filling and clicking on the form entries sends one of the following URL string with encoded parameters from the browser to the web server.

    • "https://www.foo.dom/cgi-bin/program.pl?VAR1=VAL1&VAR2=VAL2&VAR3=VAL3"

    • "https://www.foo.dom/cgi-bin/program.py?VAR1=VAL1&VAR2=VAL2&VAR3=VAL3"

    • "https://www.foo.dom/program.php?VAR1=VAL1&VAR2=VAL2&VAR3=VAL3"

  • "%nn" in URL is replaced with a character with hexadecimal nn value.

  • The environment variable is set as: "QUERY_STRING="VAR1=VAL1 VAR2=VAL2 VAR3=VAL3"".

  • CGI program (any one of "program.*") on the web server executes itself with the environment variable "$QUERY_STRING".

  • stdout of CGI program is sent to the web browser and is presented as an interactive dynamic web page.

For security reasons it is better not to hand craft new hacks for parsing CGI parameters. There are established modules for them in Perl and Python. PHP comes with these functionalities. When client data storage is needed, HTTP cookies are used. When client side data processing is needed, Javascript is frequently used.

For more, see the Common Gateway Interface, The Apache Software Foundation, and JavaScript.

Searching "CGI tutorial" on Google by typing encoded URL https://www.google.com/search?hl=en&ie=UTF-8&q=CGI+tutorial directly to the browser address is a good way to see the CGI script in action on the Google server.

There are programs to convert source codes.


If you want to make a Debian package, read followings.

There are packages such as debmake, dh-make, dh-make-perl, etc., which help packaging.



[7] Some tweaks may be required to get them work under the current system.