as(9) [minix man page]

This document describes the language accepted by the 80386 assembler
that is part of the Amsterdam Compiler Kit.  Note that only the syntax
is described; only a few 386 instructions are shown as examples.

The syntax of numbers is the same as in C.  The constants 32, 040, and
0x20 all represent the same number, but are
written in decimal, octal, and hex, respectively.  The rules  for
character  constants  and strings are also the same as in C.  For
example, 'a' is  a  character  constant.   A  typical  string  is
"string".   Expressions  may be formed with C operators, but must
use [ and ] for parentheses.  (Normal parentheses are claimed  by
the operand syntax.)  Symbols contain letters and digits, as well
as three special characters: dot,  tilde,  and	underscore.   The
first character may not be a digit or tilde.

The names of the 80386 registers are reserved.  These are:

   al, bl, cl, dl
   ah, bh, ch, dh
   ax, bx, cx, dx, eax, ebx, ecx, edx
   si, di, bp, sp, esi, edi, ebp, esp
   cs, ds, ss, es, fs, gs

The xx and exx variants of the eight general registers are treated as
synonyms by the assembler.  Normally "ax" is the 16-bit low half of the
32-bit "eax" register.  The assembler determines whether a 16 or 32 bit
operation is meant solely by looking at the instruction or the
instruction prefixes.  It is, however, best to use the proper registers
when writing assembly so as not to confuse those who read the code.
The last group of
6  segment registers are used for selector + offset mode address-
ing, in which the effective address is at a given offset  in  one
of  the 6 segments.  Names of instructions and pseudo-ops are not
reserved.  Alphabetic characters in opcodes and  pseudo-ops  must
be  in	lower  case.  Commas, blanks, and tabs are separators and
can be interspersed freely between tokens, but not within tokens.
Commas are only legal between operands.  The comment character is
!.  The rest of the line is ignored.  The opcodes are listed  be-
low.   Notes:  (1)  Different  names for the same instruction are
separated by /.  (2) Square brackets ([]) indicate that 0 or 1 of
the enclosed characters can be included.  (3) Curly brackets ({})
work similarly, except that one of the enclosed  characters  must
be  included.	Thus  square brackets indicate an option, whereas
curly brackets indicate that a choice must be made.

  mov[b]  dest, source	! Move word/byte from source to dest
  pop	  dest		! Pop stack
  push	  source	! Push stack
  xchg[b] op1, op2	! Exchange word/byte
  xlat			! Translate
  o16			! Operate on a 16 bit object instead of 32 bit

  in[b]   source	! Input from source I/O port
  in[b] 		! Input from DX I/O port
  out[b]  dest		! Output to dest I/O port
  out[b]		! Output to DX I/O port

  lds	  reg,source	! Load reg and DS from source
  les	  reg,source	! Load reg and ES from source
  lea	  reg,source	! Load effective address of source into reg
  {cdsefg}seg		! Specify seg register for next instruction
  a16			! Use 16 bit addressing mode instead of 32 bit

  lahf			! Load AH from flag register
  popf			! Pop flags
  pushf 		! Push flags
  sahf			! Store AH in flag register

  aaa			! Adjust result of BCD addition
  add[b]  dest,source	! Add
  adc[b]  dest,source	! Add with carry
  daa			! Decimal Adjust after addition
  inc[b]  dest		! Increment by 1

  aas			! Adjust result of BCD subtraction
  sub[b]  dest,source	! Subtract
  sbb[b]  dest,source	! Subtract with borrow from dest
  das			! Decimal adjust after subtraction
  dec[b]  dest		! Decrement by one
  neg[b]  dest		! Negate
  cmp[b]  dest,source	! Compare

  aam			! Adjust result of BCD multiply
  imul[b] source	! Signed multiply
  mul[b]  source	! Unsigned multiply

  aad			! Adjust AX for BCD division
  o16 cbw		! Sign extend AL into AH
  o16 cwd		! Sign extend AX into DX
  cwde			! Sign extend AX into EAX
  cdq			! Sign extend EAX into EDX
  idiv[b] source	! Signed divide
  div[b]  source	! Unsigned divide

  and[b]  dest,source	! Logical and
  not[b]  dest		! Logical not
  or[b]   dest,source	! Logical inclusive or
  test[b] dest,source	! Logical test
  xor[b]  dest,source	! Logical exclusive or

  sal[b]/shl[b] dest,CL	! Shift logical left
  sar[b]  dest,CL	! Shift arithmetic right
  shr[b]  dest,CL	! Shift logical right

  rcl[b]  dest,CL	! Rotate left, with carry
  rcr[b]  dest,CL	! Rotate right, with carry
  rol[b]  dest,CL	! Rotate left
  ror[b]  dest,CL	! Rotate right

  cmps[b]		! Compare string element ds:esi with es:edi
  lods[b]		! Load from ds:esi into AL, AX, or EAX
  movs[b]		! Move from ds:esi to es:edi
  rep			! Repeat next instruction until ECX=0
  repe/repz		! Repeat next instruction while ECX!=0 and ZF=1
  repne/repnz		! Repeat next instruction while ECX!=0 and ZF=0
  scas[b]		! Compare es:edi with AL/AX/EAX
  stos[b]		! Store AL/AX/EAX in es:edi
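
The repeat prefixes are easiest to follow with a small model.  The
sketch below is an illustration in Python, not part of the assembler:
it imitates "repe cmpsb" with the direction flag clear, assuming the
usual 80386 behaviour that the count in ECX is checked before each
iteration and ZF is checked after it.  The function name and the use of
None for "flags untouched" are hypothetical modelling choices.

```python
def repe_cmpsb(src, dst, ecx):
    """Model `repe cmpsb`: compare bytes while ECX != 0 and ZF = 1."""
    esi = edi = 0
    zf = None  # None means no comparison ran, so the flags were left alone
    while ecx != 0:
        zf = src[esi] == dst[edi]  # cmps sets ZF when the elements match
        esi += 1                   # direction flag clear: pointers go up
        edi += 1
        ecx -= 1
        if not zf:                 # repe stops as soon as ZF = 0
            break
    return ecx, zf


print(repe_cmpsb(b"hello", b"help!", 5))  # mismatch at the fourth byte
print(repe_cmpsb(b"abc", b"abc", 3))      # count exhausted, all equal
```

After the loop, ZF tells whether the last pair compared equal, and ECX
holds the number of elements that were not examined.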

As accepts a number of special jump opcodes that can assemble to
either an instruction with a byte displacement, which can only reach
targets within -126 to +129 bytes of the branch, or an instruction with
a 32-bit displacement.  The assembler automatically chooses between the
byte and the 32-bit displacement form.  The English translation of the
opcodes should be obvious, with l(ess) and g(reater) for signed
comparisons, and b(elow) and a(bove) for unsigned comparisons.  There
are lots of synonyms to allow you to write "jump if not that" instead
of "jump if this".  The call, jmp, and ret instructions can be either
intrasegment or intersegment.  The intersegment versions are indicated
with the suffix f.
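
The -126 to +129 range quoted above follows from simple arithmetic.
The Python sketch below is only an illustration; it assumes that the
short form is 2 bytes long (one opcode byte plus one signed
displacement byte) and that the displacement is counted from the
address of the following instruction.  The function names are
hypothetical.

```python
SHORT_JUMP_SIZE = 2  # assumed: opcode byte + signed 8-bit displacement


def short_jump_reach():
    """Reach of a short jump, measured from the first byte of the branch."""
    lowest = SHORT_JUMP_SIZE - 128   # most negative 8-bit displacement
    highest = SHORT_JUMP_SIZE + 127  # most positive 8-bit displacement
    return lowest, highest


def fits_short_jump(branch_addr, target_addr):
    """True if the target can be reached with the byte displacement form."""
    lowest, highest = short_jump_reach()
    return lowest <= target_addr - branch_addr <= highest


print(short_jump_reach())         # (-126, 129)
print(fits_short_jump(100, 229))  # True
print(fits_short_jump(100, 230))  # False
```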

  jmp[f]  dest		! jump to dest (8 or 32-bit displacement)
  call[f] dest		! call procedure
  ret[f]		! return from procedure

  ja/jnbe		! if above/not below or equal (unsigned)
  jae/jnb/jnc		! if above or equal/not below/not carry (uns.)
  jb/jnae/jc		! if not above nor equal/below/carry (unsigned)
  jbe/jna		! if below or equal/not above (unsigned)
  jg/jnle		! if greater/not less nor equal (signed)
  jge/jnl		! if greater or equal/not less (signed)
  jl/jnge		! if less/not greater nor equal (signed)
  jle/jng		! if less or equal/not greater (signed)
  je/jz 		! if equal/zero
  jne/jnz		! if not equal/not zero
  jno			! if overflow not set
  jo			! if overflow set
  jnp/jpo		! if parity not set/parity odd
  jp/jpe		! if parity set/parity even
  jns			! if sign not set
  js			! if sign set
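
The difference between the signed and unsigned jumps shows up when the
same 32-bit pattern is read both ways.  The Python sketch below is an
illustration only; it models the outcome of "cmp a, b" followed by one
of the jumps from the table, and the helper names are hypothetical.

```python
def to_signed32(x):
    """Read a 32-bit pattern as a two's complement signed number."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def jump_taken(mnemonic, a, b):
    """Would `mnemonic` jump after `cmp a, b`?"""
    ua, ub = a & 0xFFFFFFFF, b & 0xFFFFFFFF  # unsigned view
    sa, sb = to_signed32(a), to_signed32(b)  # signed view
    table = {
        "ja": ua > ub, "jae": ua >= ub, "jb": ua < ub, "jbe": ua <= ub,
        "jg": sa > sb, "jge": sa >= sb, "jl": sa < sb, "jle": sa <= sb,
        "je": ua == ub, "jne": ua != ub,
    }
    return table[mnemonic]


# 0xFFFFFFFF is the largest unsigned value, but -1 when read as signed.
print(jump_taken("ja", 0xFFFFFFFF, 1))  # True
print(jump_taken("jg", 0xFFFFFFFF, 1))  # False
print(jump_taken("jl", 0xFFFFFFFF, 1))  # True
```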

  jcxz	  dest		! jump if ECX = 0
  loop	  dest		! Decrement ECX and jump if ECX != 0
  loope/loopz	dest	! Decrement ECX and jump if ECX != 0 and ZF = 1
  loopne/loopnz	dest	! Decrement ECX and jump if ECX != 0 and ZF = 0

  int	  n		! Software interrupt n
  into			! Interrupt if overflow set
  iretd 		! Return from interrupt

  clc			! Clear carry flag
  cld			! Clear direction flag
  cli			! Clear interrupt enable flag
  cmc			! Complement carry flag
  stc			! Set carry flag
  std			! Set direction flag
  sti			! Set interrupt enable flag

The special symbol . is the location counter.  Its value is the address
of the first byte of the instruction in which the symbol appears, and
it can be used in expressions.

There are four different assembly segments: text, rom, data and bss.
Segments are declared and selected by the .sect pseudo-op.  It is
customary to declare all segments at the top of an assembly file like
this:

   .sect .text; .sect .rom; .sect .data; .sect .bss

The assembler accepts up to 16 different segments, but expects only
four to be used.  Anything can in principle be assembled into any
segment, but the bss segment may only contain uninitialized data.  Note
that the . symbol refers to the location in the current segment.
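
One way to picture the relation between .sect and the location counter
is a separate counter per segment.  The Python sketch below is a
hypothetical model, not a description of how the assembler is
implemented; the class and method names are invented for the
illustration, and only the limit of 16 segments comes from the text.

```python
MAX_SEGMENTS = 16  # limit quoted in the text


class LocationCounters:
    """Hypothetical model: one location counter per assembly segment."""

    def __init__(self):
        self.counters = {}
        self.current = None

    def sect(self, name):
        # The first use declares the segment; later uses only select it.
        if name not in self.counters:
            if len(self.counters) >= MAX_SEGMENTS:
                raise ValueError("too many segments")
            self.counters[name] = 0
        self.current = name

    def emit(self, nbytes):
        # Assembling nbytes advances only the current segment's counter.
        self.counters[self.current] += nbytes

    @property
    def dot(self):
        # The value of the . symbol: the location in the current segment.
        return self.counters[self.current]


lc = LocationCounters()
for name in (".text", ".rom", ".data", ".bss"):
    lc.sect(name)          # declare all segments up front
lc.sect(".text")
lc.emit(5)
lc.sect(".data")
lc.emit(8)
lc.sect(".text")
print(lc.dot)              # 5: the text counter was not moved by the data
```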
There are two types of labels: name and numeric.  Name labels consist
of a name followed by a colon (:).  The numeric labels are single
digits.  The nearest 0: label may be referenced as 0f in the forward
direction, or 0b backwards.

Each line consists of a single statement.  Blank or comment lines are
allowed.  The most general form of an instruction is

   label: opcode operand1, operand2    ! comment

The following operators can be used in expressions: + - * / & | ^ ~ <<
(shift left) >> (shift right) - (unary minus).  32-bit integer
arithmetic is used.  Division produces a truncated quotient.

Below is a list of the addressing modes supported.  Each one is
followed by an example.
  constant		      mov eax, 123456
  direct access 	      mov eax, (counter)
  register		      mov eax, esi
  indirect		      mov eax, (esi)
  base + disp.		      mov eax, 6(ebp)
  scaled index		      mov eax, (4*esi)
  base + index		      mov eax, (ebp)(2*esi)
  base + index + disp.	      mov eax, 10(edi)(1*esi)
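
The address that each memory operand in the table refers to can be
worked out as displacement + base + scale * index.  The Python sketch
below is an illustration only; the register values are made up, the
function name is hypothetical, and it assumes the 80386 rule that the
scale factor is 1, 2, 4 or 8.

```python
def effective_address(disp=0, base=0, index=0, scale=1):
    """Compute disp + base + scale*index, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (disp + base + scale * index) & 0xFFFFFFFF


# Made-up register contents for the example.
ebp, esi, edi = 0x1000, 3, 0x2000

print(hex(effective_address(disp=6, base=ebp)))             # 6(ebp)
print(hex(effective_address(index=esi, scale=4)))           # (4*esi)
print(hex(effective_address(base=ebp, index=esi, scale=2))) # (ebp)(2*esi)
print(hex(effective_address(disp=10, base=edi, index=esi))) # 10(edi)(1*esi)
```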
Any of the constants or symbols may be replaced by expressions.
Direct access, constants and displacements may be any
type of expression.  A scaled index with scale 1 may  be  written
without the 1*.  The call and jmp instructions can be interpreted
as a load into the instruction pointer.
  call _routine 	      ! Direct, intrasegment
  call (subloc) 	      ! Indirect, intrasegment
  call 6(ebp)		      ! Indirect, intrasegment
  call ebx		      ! Direct, intrasegment
  call (ebx)		      ! Indirect, intrasegment
  callf (subloc)	      ! Indirect, intersegment
  callf seg:offs	      ! Direct, intersegment
Symbols can acquire values in one of two ways.  Using a symbol as a
label sets it to . for the current segment with type relocatable.
Alternatively, a symbol may be given a value via an assignment of the
form

   symbol = expression

in which the symbol is assigned the value and type of its arguments.

Space can be reserved for bytes, words, and longs using pseudo-ops.
They take one or more operands, and for each generate a value whose
size is a byte, word (2 bytes) or long (4 bytes).  For example:

  .data1 2, 6		! allocate 2 bytes initialized to 2 and 6
  .data2 3, 0x10	! allocate 2 words initialized to 3 and 16
  .data4 010		! allocate a longword initialized to 8
  .space 40		! allocate 40 bytes of zeros

allocates 50 (decimal) bytes of storage, initializing the first two
bytes to 2 and 6, the next two words to 3 and 16, then one longword
with value 8 (010 octal), and last 40 bytes of zeros.

The pseudo-ops
.ascii and .asciz take one string argument and generate the ASCII
character codes for the letters in the string.  The latter
automatically terminates the string with a null (0) byte.  For example,

   .ascii "hello"
   .asciz "world\n"

Sometimes it is necessary to force the next item to begin at a word,
longword or even a 16 byte address boundary.  The .align pseudo-op
inserts zero or more null bytes until the current location is a
multiple of the argument of .align.

Every
item assembled goes in one of the four segments: text, rom, data,
or  bss.  By using the .sect pseudo-op with argument .text, .rom,
.data or .bss, the programmer can force the next items to go in a
particular segment.

A symbol can be given global scope by including it in a .define
pseudo-op.  Multiple names may be listed, separated by commas.  The
.define pseudo-op must be used to export symbols defined in the current
program.  Names not defined in the current program are treated as
"undefined external" automatically, although it is customary to make
this explicit with the .extern pseudo-op.

The
.comm  pseudo-op declares storage that can be common to more than
one module.  There are two arguments: a name and an absolute  ex-
pression  giving  the size in bytes of the area named by the sym-
bol.  The type of the symbol becomes external.	The statement can
appear in any segment.  If you think this has something to do with
FORTRAN, you are right.

In the kernel directory, there are several assembly code files that are
worth inspecting as examples.  However, note that these files are
designed to first be run through the C preprocessor.  (The very first
character is a # to signal this.)  Thus they contain numerous
constructs that are not pure assembler.  For true assembler examples,
compile any C program provided with MINIX using the -S flag.  This will
result in an assembly language file with the same name as the C source
file, but ending with the .s suffix.
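
The byte counts in the storage allocation and alignment paragraphs
above can be checked with a small model.  The Python sketch below is an
illustration, not the assembler's implementation: the Segment class and
its method names are hypothetical stand-ins for the .data1, .data2,
.data4, .space, .asciz and .align pseudo-ops, and it assumes the
little-endian byte order of the 80386.

```python
import struct


class Segment:
    """Hypothetical byte-level model of one assembly segment."""

    def __init__(self):
        self.data = bytearray()

    def data1(self, *values):          # one byte per operand
        for v in values:
            self.data += struct.pack("<B", v & 0xFF)

    def data2(self, *values):          # one 2-byte word per operand
        for v in values:
            self.data += struct.pack("<H", v & 0xFFFF)

    def data4(self, *values):          # one 4-byte long per operand
        for v in values:
            self.data += struct.pack("<I", v & 0xFFFFFFFF)

    def space(self, count):            # count bytes of zeros
        self.data += bytes(count)

    def asciz(self, text):             # ASCII codes plus a null byte
        self.data += text.encode("ascii") + b"\0"

    def align(self, boundary):         # pad with nulls up to a multiple
        self.data += bytes(-len(self.data) % boundary)


seg = Segment()
seg.data1(2, 6)
seg.data2(3, 0x10)
seg.data4(0o10)
seg.space(40)
size_after_space = len(seg.data)
print(size_after_space)                # 50, as stated in the text

seg.asciz("hi")                        # 3 more bytes
seg.align(16)                          # pad up to the next 16 byte boundary
print(len(seg.data))                   # 64
```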