### MIPS Digital Media Extension

### C 1 Introduction

The MIPS Digital Media Extension supports video, audio, and graphics pixel processing by introducing vectors of small integers.

The MIPS Digital Media Extension (MDMX) is not a part of the MIPS Instruction Set Architecture (ISA). If a MIPS processor implements the MDMX, that implementation will follow this specification with no supersetting or subsetting. There is no requirement that a MIPS processor implement the MDMX; a processor that implements the MDMX must implement the MIPS-V ISA.

The MIPS MDMX is not intended for general purpose computing. Software support for the MDMX is via shared libraries (DSOs) and assembly language only. Compiler support is neither implied nor planned.

### C 2 Register files

The Digital Media extension shares a register file with the Floating Point Unit. Data is moved between the shared register file and memory with existing Floating Point load and store doubleword operations (LDC1, SDC1, LDXC1, SDXC1, LUXC1, and SUXC1). These operations were extended in MIPS-V to include unaligned loads and stores (that is, loads and stores that ignore any address misalignment). Alignment within a doubleword is performed by Align and Merge instructions. The DMTC1 and DMFC1 instructions may also be used to move data to and from the integer GPRs.

The registers are interpreted in two new formats: Quad Half (QH) and Oct Byte (OB). In Quad Half format, a 64-bit FPR is interpreted as a vector of 4 signed 16-bit integers. In Oct Byte format, a 64-bit FPR is interpreted as a vector of 8 unsigned 8-bit integers. There is no data format conversion between floating-point and the new formats.
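The two interpretations of a 64-bit FPR can be sketched as follows. This is an illustrative Python model, not MDMX code: the function names are hypothetical, and the sketch assumes that element 0 occupies the least-significant bits of the register.

```python
def unpack_ob(reg64):
    """Interpret a 64-bit value as 8 unsigned 8-bit elements.

    ASSUMPTION: index 0 is the least-significant byte.
    """
    return [(reg64 >> (8 * i)) & 0xFF for i in range(8)]


def unpack_qh(reg64):
    """Interpret a 64-bit value as 4 signed 16-bit elements.

    ASSUMPTION: index 0 is the least-significant halfword.
    """
    elements = []
    for i in range(4):
        half = (reg64 >> (16 * i)) & 0xFFFF
        # Reinterpret the raw 16 bits as a two's-complement value.
        elements.append(half - 0x10000 if half & 0x8000 else half)
    return elements
```

The same 64 bits therefore yield different vectors depending on the format named by the instruction; no conversion takes place when the register is loaded.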

The MDMX also shares the 8 Floating Point Condition Code bits. Unlike the FPU, the MDMX is capable of reading and writing subsets or even all 8 of these bits simultaneously during vector compare and select operations.

The MDMX has a private 192-bit Accumulator register. The format of the Accumulator is determined by the format of the elements accumulated. In QH format, the Accumulator contains 4 48-bit elements; in OB format, the Accumulator contains 8 24-bit elements. Accumulator elements are always signed. The Accumulator cannot be directly loaded from or stored to main memory, but rather must be staged through the shared FP register file.

Digital Media operations always write all 192 bits of the Accumulator or all 64 bits of an FPR, or the condition codes. Results are not stored to multiple destinations (including the condition codes).

### C 3 Exceptions

With the exception of the SHFL instruction, integer vector operations that write to the FPRs clamp the values being written to the target's representable range. Integer vector operations that write to an Accumulator do not clamp their values before writing, but allow underflows and overflows to wrap around the representable range. It is the responsibility of software to ensure that unwanted overflows and underflows do not occur when writing to the Accumulator or FPRs.
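The difference between clamping (FPR destinations) and wrapping (Accumulator destinations) can be illustrated with a short Python sketch. The helper names are hypothetical; the element ranges follow the QH and OB definitions and the 48-bit and 24-bit Accumulator element widths given in section C 2.

```python
QH_RANGE = (-2**15, 2**15 - 1)  # signed 16-bit QH element
OB_RANGE = (0, 2**8 - 1)        # unsigned 8-bit OB element


def clamp(value, lo, hi):
    """Saturate value to the representable range [lo, hi]."""
    return max(lo, min(hi, value))


def wrap_signed(value, bits):
    """Reduce value modulo 2**bits and reinterpret it as a signed number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value
```

For example, a QH sum of 40000 written to an FPR is clamped to 32767, whereas a 48-bit Accumulator element that reaches $2^{47}$ wraps to $-2^{47}$.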

### C 4 Instruction Format and VT Selection

The *fmt/sel* field in many integer vector instructions specifies the data format and those elements of vector *vt* which are used with each element of the accumulator *acc*, vector *vs*, or vector *vd*. The format encoding is shown in Table C-1 below. The BW and L formats are reserved for future use.

| fmt/sel   | Format                 |
|-----------|------------------------|
| s s s s 0 | OB (oct byte)          |
| s s s 0 1 | QH (quad halfword)     |
| s s 0 1 1 | BW (bi word), reserved |
| s s 1 1 1 | L (long), reserved     |

Table C-1 Format Encoding

The part of the field labeled "s" indicates the VT selection for the specified format. Table C-2 describes the VT select encoding:

| fmt/sel   | VT select        |
|-----------|------------------|
| 0 x x x x | element select   |
| 1 0 x x x | select vector    |
| 1 1 x x x | select immediate |

Table C-2 Select Encoding

Element select will select one element in VT and replicate it for every element of VT. For select vector, VT is passed without any modification. For select immediate, the VT field of the instruction opcode is used as an immediate value that is replicated for every element of VT.

The following two tables, Table C-3 and Table C-4, show all valid OB and QH *fmt/sel* encodings and the vector element used. All other encodings are reserved or invalid.

| fmt/sel \ OB element | H | G | F | E | D | C | B | A |
|----------------------|---|---|---|---|---|---|---|---|
| 0 000 0              | A | A | A | A | A | A | A | A |
| 0 001 0              | B | B | B | B | B | B | B | B |
| 0 010 0              | C | C | C | C | C | C | C | C |
| 0 011 0              | D | D | D | D | D | D | D | D |
| 0 100 0              | E | E | E | E | E | E | E | E |
| 0 101 0              | F | F | F | F | F | F | F | F |
| 0 110 0              | G | G | G | G | G | G | G | G |
| 0 111 0              | H | H | H | H | H | H | H | H |
| 10 11 0              | H | G | F | E | D | C | B | A |
| 11 11 0              | # | # | # | # | # | # | # | # |

Table C-3 OB Format and Selects

| fmt/sel \ QH element | D | C | B | A |
|----------------------|---|---|---|---|
| 0 00 01              | A | A | A | A |
| 0 01 01              | B | B | B | B |
| 0 10 01              | C | C | C | C |
| 0 11 01              | D | D | D | D |
| 10 1 01              | D | C | B | A |
| 11 1 01              | # | # | # | # |

Table C-4 QH Format and Selects

Most commonly, elements of vector *vt* are used with the same-numbered elements of the other vector operands, in which case the *fmt/sel* field contains binary 10xxx, and the assembly notation looks like any other vector register, e.g.:

sub.ob \$v4, \$v7, \$v2

However, the *fmt/sel* field can also direct that the second argument to an instruction be a vector of immediates -- copies of the *vt* field interpreted as a five bit unsigned number, like this:

add.qh \$v10, \$v9, 25

An element of vector *vt* can be propagated, to be used with each of the elements of the other vector operand *vs.* Following is the notation to propagate one element of vector *vt* to be used in every element of the computation:

addl.ob \$acc0, \$v4, \$v6[7]
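The VT selection described in this section can be sketched as a small decoder. This is an illustrative Python model based on Tables C-1 through C-4; the function name `select_vt` and its argument layout are hypothetical, and index 0 of `vt_vector` is assumed to be element A.

```python
def select_vt(fmt_sel, vt_field, vt_vector):
    """Return (format, operand vector) for a 5-bit fmt/sel value.

    vt_field  -- the 5-bit vt field of the instruction (used for immediates)
    vt_vector -- elements of register vt, index 0 = element A (assumption)
    """
    if not 0 <= fmt_sel < 32:
        raise ValueError("fmt/sel is a 5-bit field")
    if (fmt_sel & 0b00001) == 0b00000:      # s s s s 0: OB
        fmt, count, index = "OB", 8, (fmt_sel >> 1) & 0b111
        vector_code, immediate_code = 0b10110, 0b11110
    elif (fmt_sel & 0b00011) == 0b00001:    # s s s 0 1: QH
        fmt, count, index = "QH", 4, (fmt_sel >> 2) & 0b11
        vector_code, immediate_code = 0b10101, 0b11101
    else:
        raise ValueError("reserved format (BW or L)")
    if len(vt_vector) != count:
        raise ValueError("vt_vector has the wrong number of elements")
    if (fmt_sel >> 4) == 0:                 # 0 x x x x: element select
        return fmt, [vt_vector[index]] * count
    if fmt_sel == vector_code:              # select vector: pass vt unchanged
        return fmt, list(vt_vector)
    if fmt_sel == immediate_code:           # select immediate: replicate vt field
        return fmt, [vt_field] * count
    raise ValueError("reserved fmt/sel encoding")
```

Under these assumptions, the operand `$v6[7]` of an OB instruction corresponds to `fmt_sel = 0b01110`, and the immediate operand 25 of a QH instruction corresponds to `fmt_sel = 0b11101` with `vt_field = 25`.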

### C 5 Data format conversion

There is no implicit data type conversion from QH to OB or from OB to QH. Both are stored as bit-arrays in memory; however, the internal floating-point register formats may differ. QH and OB vectors may be read or written without regard to datatype. Conversion from a bit-array to either a QH or OB occurs during the execution of the first MDMX opcode which includes a format field (e.g., ADD). Subsequent operations must use the same datatype; mixing QH and OB operations without explicit register content conversion results in an undefined operation.

The shuffle (SHFL) and MIN/MAX operators can be used to convert QH and OB vectors. To convert either the lower or upper bytes of an OB vector into a QH vector, the SHFL.UPUL and SHFL.UPUH operations can be used to convert the lower [0:3] and upper [4:7] unsigned bytes into signed halves. To convert two QH vectors to an OB vector, the QH vectors should first be clamped to 0..255, then packed (via SHFL.PACL.OB). Clamping can be done with the MIN.QH and MAX.QH instructions.
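The clamp-then-pack sequence can be modeled as below. This is a sketch only: the helper names are hypothetical, and the element ordering used for packing and unpacking is an assumption, since this section does not define the SHFL element mappings.

```python
def max_qh(vs, vt):
    """Model of MAX.QH on already-selected operands."""
    return [max(a, b) for a, b in zip(vs, vt)]


def min_qh(vs, vt):
    """Model of MIN.QH on already-selected operands."""
    return [min(a, b) for a, b in zip(vs, vt)]


def clamp_to_byte_range(qh):
    """Clamp each QH element to 0..255 with a MAX followed by a MIN."""
    return min_qh(max_qh(qh, [0] * 4), [255] * 4)


def pack_qh_pair_to_ob(low_qh, high_qh):
    # ASSUMPTION: low_qh supplies OB elements 0..3 and high_qh elements 4..7.
    return clamp_to_byte_range(low_qh) + clamp_to_byte_range(high_qh)


def unpack_ob_half(ob, upper):
    # Zero-extend OB elements [4:7] (upper) or [0:3] (lower) to QH elements.
    return list(ob[4:8] if upper else ob[0:4])
```

Because OB elements are unsigned, every unpacked value is non-negative and fits a signed halfword without further adjustment.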

### C 6 Description of an Instruction

For the Digital Media instruction documentation, all variable subfields in an instruction format (such as *vs, vt, acc, sel*, and so on) are shown in lower-case. The instruction name (such as ADD, SUB, and so on) is shown in upper-case.

In some instructions, the instruction subfields *op* and *function* can have constant 6- and 5-bit values. When reference is made to these instructions, upper-case mnemonics are used. For instance, the floating-point ADD instruction uses *op* = COP1 and *function* = ADD. In other cases, a single field has both fixed and variable subfields, so the name contains both upper and lower case characters.

### C 7 Opcode encoding

| function (for opcode = COP2): bits 5..3 \ bits 2..0 | 0 (000) | 1 (001) | 2 (010)  | 3 (011)  | 4 (100) | 5 (101) | 6 (110)  | 7 (111)  |
|------------------------------------------------------|---------|---------|----------|----------|---------|---------|----------|----------|
| 0 (000)                                              | MSGN    | C.EQ    | PICKF    | PICKT    | C.LT    | C.LE    | MIN      | MAX      |
| 1 (001)                                              | †       | †       | SUB      | ADD      | AND     | XOR     | OR       | NOR      |
| 2 (010)                                              | SLL     | †       | SRL      | SRA      | †       | †       | †        | †        |
| 3 (011)                                              | ALNI.OB | ALNV.OB | ALNI.QH  | ALNV.QH  | †       | †       | †        | SHFL     |
| 4 (100)                                              | RZU     | RNAU    | RNEU     | †        | RZS     | RNAS    | RNES     | †        |
| 5 (101)                                              | †       | †       | †        | †        | †       | †       | †        | †        |
| 6 (110)                                              | MUL     | †       | MULS{,L} | MUL{A,L} | †       | †       | SUB{A,L} | ADD{A,L} |
| 7 (111)                                              | †       | †       | †        | †        | †       | †       | WAC      | RAC      |
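As an illustration of how the function codes combine with the other instruction fields, the following Python sketch assembles a 32-bit instruction word. It assumes the field layout used by the three-register vector instructions (COP2 in bits 31..26, *fmt/sel* in 25..21, *vt* in 20..16, *vs* in 15..11, *vd* in 10..6, *function* in 5..0). The `encode` helper and the `FUNCTION` dictionary are hypothetical names, and only a few function codes from the table are included.

```python
COP2 = 0b010010

# Function codes read from the opcode table (bits 5..3, then bits 2..0).
FUNCTION = {
    "MIN": 0b000110,
    "MAX": 0b000111,
    "SUB": 0b001010,
    "ADD": 0b001011,
    "AND": 0b001100,
    "MUL": 0b110000,
}


def encode(mnemonic, fmt_sel, vt, vs, vd):
    """Assemble a 32-bit word for a vd, vs, vt style vector instruction."""
    for name, value in (("fmt_sel", fmt_sel), ("vt", vt), ("vs", vs), ("vd", vd)):
        if not 0 <= value < 32:
            raise ValueError(f"{name} must fit in 5 bits")
    return ((COP2 << 26) | (fmt_sel << 21) | (vt << 16)
            | (vs << 11) | (vd << 6) | FUNCTION[mnemonic])
```

Under these assumptions, `sub.ob $v4, $v7, $v2` with the select-vector encoding `0b10110` assembles to `0x4AC2390A`.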

**Vector Add** 

## ADD.fmt

| 31 26          | 25 21   | 20 16 | 15 11 | 10 6 | 5 0           |
|----------------|---------|-------|-------|------|---------------|
| COP2<br>010010 | fmt/sel | vt    | vs    | vd   | ADD<br>001011 |
| 6              | 5       | 5     | 5     | 5    | 6             |

| Format:  | ADD.QH vd, vs, vt       | MDMX |
|----------|-------------------------|------|
|          | ADD.OB vd, vs, vt       |      |
| Purpose: | To add integer vectors. |      |

Description:  $vd[i] \leftarrow vs[i]$ +select(i,sel,vt)

The values in vector *vt* are added to the values in vector *vs*. Saturated arithmetic is performed, such that overflows and underflows clamp to the largest or smallest representable value before writing to vector *vd*.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-3 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, so no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

Operation:

StoreFPR (vd, fmt, Clamp(FGR[vs] + FGR[vt]))
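The saturating behavior of ADD.fmt can be modeled with a few lines of Python. This is a sketch, not the definition of the instruction: `vector_add` is a hypothetical name, and `operand` stands for the result of select(i,sel,vt), that is, the *vt* vector after VT selection has been applied.

```python
RANGES = {
    "QH": (-2**15, 2**15 - 1),  # 4 signed 16-bit elements
    "OB": (0, 2**8 - 1),        # 8 unsigned 8-bit elements
}


def vector_add(fmt, vs, operand):
    """Element-wise add with each sum clamped to the format's range."""
    lo, hi = RANGES[fmt]
    return [max(lo, min(hi, a + b)) for a, b in zip(vs, operand)]
```

For instance, adding 1000 to a QH element holding 32000 yields 32767 rather than a wrapped negative value.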

**Exceptions:** 

## **ADDA.fmt**

| 31 26          | 25 21   | 20 16 | 15 11 | 10     | 9 6 | 5 0            |
|----------------|---------|-------|-------|--------|-----|----------------|
| COP2<br>010010 | fmt/sel | vt    | vs    | L<br>0 | 0   | ADDA<br>110111 |
| 6              | 5       | 5     | 5     | 1      | 4   | 6              |

| Format:  | ADDA.QH vs, vt          | MDMX |
|----------|-------------------------|------|
|          | ADDA.OB vs, vt          |      |
| Purpose: | To add integer vectors. |      |

Description:  $acc[i] \leftarrow acc[i] + vs[i] + select(i,sel,vt)$

The values in vector *vt* and vector *vs* are added to those in the Accumulator. Wrapped arithmetic is performed, such that overflows and underflows wrap around the Accumulator's representable range before being written into the Accumulator.

The operands are values in integer vector format *fmt*. The Accumulator is in the corresponding Accumulator vector format. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, so no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

Operation:

StoreACC (acc, fmt, Wrap(ValueACC(acc, fmt) + FGR[vs] + FGR[vt]))
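The wrapped accumulation can be sketched in Python as follows. The names `wrap` and `adda` are hypothetical, `operand` stands for the *vt* vector after VT selection, and the Accumulator element widths (48 bits for QH, 24 bits for OB) are taken from section C 2.

```python
ACC_BITS = {"QH": 48, "OB": 24}  # Accumulator element width per format


def wrap(value, bits):
    """Reduce value modulo 2**bits and reinterpret it as a signed number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def adda(fmt, acc, vs, operand):
    """Model of ADDA.fmt: acc[i] <- Wrap(acc[i] + vs[i] + operand[i])."""
    bits = ACC_BITS[fmt]
    return [wrap(a + s + t, bits) for a, s, t in zip(acc, vs, operand)]
```

An OB Accumulator element holding $2^{23} - 1$ wraps to $-2^{23}$ when 1 is added, which is why software must guard against unwanted overflow.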

Exceptions:

#### Load Vector Add

## **ADDL.fmt**

| 31 26          | 25 21   | 20 16 | 15 11 | 10     | 9 6 | 5 0            |
|----------------|---------|-------|-------|--------|-----|----------------|
| COP2<br>010010 | fmt/sel | vt    | vs    | L<br>1 | 0   | ADDA<br>110111 |
| 6              | 5       | 5     | 5     | 1      | 4   | 6              |

| Format:  | ADDL.QH vs, vt          | MDMX |
|----------|-------------------------|------|
|          | ADDL.OB vs, vt          |      |
| Purpose: | To add integer vectors. |      |

Description:  $acc[i] \leftarrow vs[i] + select(i,sel,vt)$

The values in vector *vt* are added to the values in vector *vs*, and the sum is loaded into the Accumulator, replacing its previous contents.

The operands are values in integer vector format *fmt*. The Accumulator is in the corresponding Accumulator vector format. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, so no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

#### Operation:

StoreACC (acc, fmt, FGR[vs] + FGR[vt])

#### Exceptions:

Vector Align, Constant Alignment

## **ALNI.fmt**

| 31 26          | 25 24 | 23 21 | 20 16 | 15 11 | 10 6 | 5 0                     |
|----------------|-------|-------|-------|-------|------|-------------------------|
| COP2<br>010010 | 0     | imm   | vt    | vs    | vd   | ALNI.fmt<br>0 1 1 0 x 0 |
| 6              | 2     | 3     | 5     | 5     | 5    | 6                       |

| Format:  | ALNI.QH vd, vs, vt, imm              | MDMX |
|----------|--------------------------------------|------|
|          | ALNI.OB vd, vs, vt, imm              |      |
| Purpose: | To perform a byte-wise funnel shift. |      |

Description:  $vd \leftarrow ByteAlign(imm_{2..0}, vs, vt)$ 

The align amount is computed by masking the immediate, then using that value to control a funnel shift of vector *vs* concatenated with vector *vt*.

This operation is a media unit operation, and so no data-dependent exceptions are possible.

The operands must be a value in QH or OB format. If not, the results are undefined and the values of the operand vectors become undefined.

This operation does not interpret the format of the registers specified.

Operation:

```
s ← imm[2..0] * 8
if BigEndianCPU then
    vd ← (vs || vt)[127-s .. 64-s]
else
    vd ← (vs || vt)[63+s .. s]
endif
```
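The funnel shift above can be checked with a short Python model. This is a sketch: `byte_align` is a hypothetical name, and *vs* and *vt* are passed as unsigned 64-bit integers.

```python
MASK64 = (1 << 64) - 1


def byte_align(amount, vs, vt, big_endian):
    """Select 64 bits from the 128-bit concatenation vs || vt."""
    s = (amount & 0b111) * 8            # mask to bits 2..0, convert to bits
    concat = ((vs & MASK64) << 64) | (vt & MASK64)
    low = 64 - s if big_endian else s   # lowest bit of the selected window
    return (concat >> low) & MASK64
```

With a zero align amount, a big-endian CPU returns *vs* unchanged and a little-endian CPU returns *vt* unchanged; each increment of the amount moves the 64-bit window by one byte.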

Exceptions:

Vector Align, Variable Alignment

## **ALNV.fmt**

**MDMX** 

| 31 26          | 25 21 | 20 16 | 15 11 | 10 6 | 5 0                     |
|----------------|-------|-------|-------|------|-------------------------|
| COP2<br>010010 | rs    | vt    | VS    | vd   | ALNV.fmt<br>0 1 1 0 x 1 |
| 6              | 5     | 5     | 5     | 5    | 6                       |
|                |       |       |       |      |                         |

| Format:  | ALNV.QH vd, vs, vt, rs               |
|----------|--------------------------------------|
|          | ALNV.OB vd, vs, vt, rs               |
| Purpose: | To perform a byte-wise funnel shift. |

Description:  $vd \leftarrow ByteAlign(GPR[rs]_{2..0}, vs, vt)$ 

The align amount is computed by masking the contents of GPR *rs*, then using that value to control a funnel shift of vector *vs* concatenated with vector *vt*.

This operation is a media unit operation, and so no data-dependent exceptions are possible.

The operands must be a value in QH or OB format. If not, the results are undefined and the values of the operand vectors become undefined.

This operation does not interpret the format of the registers specified.

#### Operation:

```
s ← GPR[rs][2..0] * 8
if BigEndianCPU then
    vd ← (vs || vt)[127-s .. 64-s]
else
    vd ← (vs || vt)[63+s .. s]
endif
```

#### Exceptions:

## **AND.fmt**

| 31 26          | 25 21   | 20 16 | 15 11 | 10 6 | 5 0           |
|----------------|---------|-------|-------|------|---------------|
| COP2<br>010010 | fmt/sel | vt    | vs    | vd   | AND<br>001100 |
| 6              | 5       | 5     | 5     | 5    | 6             |

| Format:  | AND.QH vd, vs, vt            | MDMX |
|----------|------------------------------|------|
|          | AND.OB vd, vs, vt            |      |
| Purpose: | To do a bitwise logical AND. |      |

Description:  $vd[i] \leftarrow vs[i]$  AND select(i,sel,vt)

Each element of vector *vs* is combined with the corresponding element of vector *vt* in a bitwise logical AND operation.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, so no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

Operation:

StoreFPR(vd, fmt, ValueFPR(vs,fmt) and ValueFPR(vt,fmt))

#### Exceptions:

**Vector Compare** 

## **C.cond.fmt**

| 31 26          | 25 21   | 20 16 | 15 11 | 10 6 | 5 0              |
|----------------|---------|-------|-------|------|------------------|
| COP2<br>010010 | fmt/sel | vt    | vs    | 0    | C.cond<br>000xxx |
| 6              | 5       | 5     | 5     | 5    | 6                |

| Format:  | C.cond.QH vs, vt              | MDMX |
|----------|-------------------------------|------|
|          | C.cond.OB vs, vt              |      |
| Purpose: | To perform vector comparison. |      |

Description:  $cc[i] \leftarrow vs[i]$  cond select(i,sel,vt)

The values in vector *vt* are compared to the values in vector *vs*, and the result is written to the condition codes. In OB format, all 8 CC bits are written. In QH format, *cc* bits 0 through 3 are written, and *cc* bits 4 through 7 are unaffected.

The comparisons available are less than (LT), less than or equal (LE), and equal (EQ). The inverse comparisons (GE, GT, NE) are not necessary; the instructions that use condition codes (BC1F, BC1T, MOVF, MOVT, PICKF, PICKT) all allow both cc=0 and cc=1 tests. Both LT and LE comparisons are necessary since the operands are not symmetrical: every element of vector *vs* is used, whereas *sel* selects the values of *vt*[] used for each i.

The operands are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, so no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

#### Operation:

Exceptions:

Coprocessor Unusable

Reserved Instruction
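The effect of the vector compare on the condition codes can be sketched in Python. This is an illustrative model only: `vector_compare` is a hypothetical name, `operand` stands for the *vt* vector after VT selection, and the condition codes are modeled as an 8-bit integer in which bit i holds cc[i].

```python
import operator

CONDITIONS = {"EQ": operator.eq, "LT": operator.lt, "LE": operator.le}
ELEMENTS = {"OB": 8, "QH": 4}


def vector_compare(cond, fmt, cc, vs, operand):
    """Return the new 8-bit condition-code value after C.cond.fmt."""
    test = CONDITIONS[cond]
    for i in range(ELEMENTS[fmt]):
        bit = 1 if test(vs[i], operand[i]) else 0
        # Replace only bit i; in QH format bits 4..7 are never touched.
        cc = (cc & ~(1 << i)) | (bit << i)
    return cc & 0xFF
```

In QH format the upper four condition codes pass through unchanged, so a later PICKT or PICKF that reads them sees the values left by an earlier operation.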

## **MAX.fmt**

| 31 26          | 25 21   | 20 16 | 15 11 | 10 6 | 5 0           |
|----------------|---------|-------|-------|------|---------------|
| COP2<br>010010 | fmt/sel | vt    | vs    | vd   | MAX<br>000111 |
| 6              | 5       | 5     | 5     | 5    | 6             |

| Format:  | MAX.QH vd, vs, vt          | MDMX |
|----------|----------------------------|------|
|          | MAX.OB vd, vs, vt          |      |
| Purpose: | To perform vector maximum. |      |

Description:  $vd[i] \leftarrow max(vs[i], select(i, sel, vt))$

The values in vector *vt* are compared to the values in vector *vs*, and the larger is written to each element of vector *vd*.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, so no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

Operation:

Exceptions:

#### **Vector Minimum**

## **MIN.fmt**

| 31 26          | 25 21   | 20 16 | 15 11 | 10 6 | 5 0           |
|----------------|---------|-------|-------|------|---------------|
| COP2<br>010010 | fmt/sel | vt    | vs    | vd   | MIN<br>000110 |
| 6              | 5       | 5     | 5     | 5    | 6             |

| Format:  | MIN.QH vd, vs, vt          | MDMX |
|----------|----------------------------|------|
|          | MIN.OB vd, vs, vt          |      |
| Purpose: | To perform vector minimum. |      |

Description:  $vd[i] \leftarrow min(vs[i], select(i, sel, vt))$ 

The values in vector *vt* are compared to the values in vector *vs*, and the smaller is written to each element of vector *vd*.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, so no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

#### Operation:

Exceptions:

## **MSGN.fmt**

| 31 26          | 25 21                | 20 16 | 15 11 | 10 6 | 5 0            |
|----------------|----------------------|-------|-------|------|----------------|
| COP2<br>010010 | fmt/sel<br>x x x 0 1 | vt    | vs    | vd   | MSGN<br>000000 |
| 6              | 5                    | 5     | 5     | 5    | 6              |

| Format:  | MSGN.QH vd, vs, vt                                | MDMX |
|----------|---------------------------------------------------|------|
| Purpose: | To multiply sign bits from one vector by another. |      |

Description:  $vd[i] \leftarrow (vs[i] < 0)$ ? -select(i,sel,vt) : ((vs[i] = 0) ? 0 : select(i,sel,vt))

The values in vector *vt* are multiplied by the sign of the values in vector *vs*, and the result is written to vector *vd*. If an element of vector *vs* is zero, the corresponding element of vector *vd* is set to zero.

Should select(i,sel,*vt*) be the maximum negative value ($-2^{15}$), and *vs*[i] < 0, then -select(i,sel,*vt*) will overflow and be clamped to the maximum positive value ($2^{15} - 1$).

The operands are values in integer vector format QH. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, so no data-dependent exceptions are possible.

The operands must be a value in format QH. If not, the results are undefined and the values of the operand vectors become undefined.

Operation:
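The Description above, including the clamping of the negated maximum negative value, can be modeled in Python as follows. This is a sketch rather than the specification's own operation text: `msgn_qh` is a hypothetical name, and `operand` stands for the *vt* vector after VT selection.

```python
QH_MAX = 2**15 - 1


def msgn_qh(vs, operand):
    """Model of MSGN.QH: multiply each operand element by the sign of vs."""
    result = []
    for s, t in zip(vs, operand):
        if s < 0:
            # Negating -2**15 overflows, so clamp to the largest QH value.
            result.append(min(QH_MAX, -t))
        elif s == 0:
            result.append(0)
        else:
            result.append(t)
    return result
```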

Exceptions:

#### Vector Multiply

### **MUL.fmt**

| 31 26          | 25 21   | 20 16 | 15 11 | 10 6 | 5 0           |
|----------------|---------|-------|-------|------|---------------|
| COP2<br>010010 | fmt/sel | vt    | vs    | vd   | MUL<br>110000 |
| 6              | 5       | 5     | 5     | 5    | 6             |

| Format:  | MUL.QH vd, vs, vt            | MDMX |
|----------|------------------------------|------|
|          | MUL.OB vd, vs, vt            |      |
| Purpose: | To multiply integer vectors. |      |

Description:  $vd[i] \leftarrow vs[i] * select(i,sel,vt)$

The values in vector *vt* are multiplied by the values in vector *vs*, and the product is written into vector *vd*. Saturated arithmetic is performed, such that overflows and underflows clamp to the largest or smallest representable value before writing to vector *vd*.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, so no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

#### Operation:

StoreFPR (vd, fmt, Clamp(FGR[vs] \* FGR[vt]))

**Exceptions:** 

Coprocessor Unusable

Accumulate Vector Multiply

## **MULA.fmt**

| 31 26          | 25 21   | 20 16 | 15 11 | 10     | 9 6 | 5 0            |
|----------------|---------|-------|-------|--------|-----|----------------|
| COP2<br>010010 | fmt/sel | vt    | vs    | L<br>0 | 0   | MULA<br>110011 |
| 6              | 5       | 5     | 5     | 1      | 4   | 6              |

| Format:  | MULA.QH vs, vt                                            | MDMX |
|----------|-----------------------------------------------------------|------|
|          | MULA.OB vs, vt                                            |      |
| Purpose: | To perform a combined multiply-then-add of integer vectors. |      |

Description:  $acc[i] \leftarrow acc[i] + (vs[i] * select(i,sel,vt))$

The values in vector *vt* are multiplied by the values in vector *vs*, and the product is added to the Accumulator. Wrapped arithmetic is performed, such that overflows and underflows wrap around the Accumulator's representable range before being written into the Accumulator.

The operands are values in integer vector format *fmt*. The Accumulator is in the corresponding Accumulator vector format. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, so no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

Operation:

StoreACC (acc, fmt, Wrap(ValueACC(acc,fmt) + (FGR[vs] \* FGR[vt])))
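A typical use of the multiply-accumulate is to sum several products per element, starting from a product that is loaded rather than accumulated. The Python sketch below models that pattern; `mull` and `mula` are hypothetical helper names mirroring the MULL.fmt and MULA.fmt descriptions, and `operand` stands for the *vt* vector after VT selection.

```python
ACC_BITS = {"QH": 48, "OB": 24}  # Accumulator element width per format


def wrap(value, bits):
    """Reduce value modulo 2**bits and reinterpret it as a signed number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mull(fmt, vs, operand):
    """acc[i] <- vs[i] * operand[i] (previous Accumulator contents discarded)."""
    bits = ACC_BITS[fmt]
    return [wrap(s * t, bits) for s, t in zip(vs, operand)]


def mula(fmt, acc, vs, operand):
    """acc[i] <- Wrap(acc[i] + vs[i] * operand[i])."""
    bits = ACC_BITS[fmt]
    return [wrap(a + s * t, bits) for a, s, t in zip(acc, vs, operand)]


# Two-tap sum of products: the first tap loads, the second accumulates.
acc = mull("QH", [1, 2, 3, 4], [10, 10, 10, 10])
acc = mula("QH", acc, [1, 1, 1, 1], [-5, -5, -5, -5])
```

Starting with a load avoids a separate step to clear the Accumulator before the first product.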

Exceptions:

Add Vector Multiply to Accumulator

### **MULL.fmt**

| 31                  | 26  | 25         | 21               | 20               | 16      | 15    |        | 11   | 10     | 9  | (     | 6  | 5                   | 0  |
|---------------------|-----|------------|------------------|------------------|---------|-------|--------|------|--------|----|-------|----|---------------------|----|
| COP2<br>0 1 0 0 1 0 | D   | 1          | imt/sel          |                  | vt      |       | VS     |      | L<br>1 |    | 0     |    | MULA<br>1 1 0 0 1 1 |    |
| 6                   | 6 5 |            |                  | 5                |         |       | 5      |      | 1      |    | 4     |    | 6                   |    |
| Format:             | 1   | MUL<br>MUL | L.QH v<br>L.OB v | /s, vt<br>/s, vt |         |       |        |      |        |    |       |    | MDM                 | IX |
| Purpose:            |     | Го р       | erform a         | com              | bined n | nulti | ply-tl | hen- | add    | of | integ | ge | r vectors.          |    |

Description:  $acc[i] \leftarrow vs[i]^*select(i,sel,vt)$ 

The values in vector *vt* are multiplied by the values in vector *vs*, and the product is written into the Accumulator, replacing its previous contents. Wrapped arithmetic is performed, such that overflows and underflows wrap around the Accumulator's representable range before being written into the Accumulator.

The operands are values in integer vector format *fmt*. The Accumulator is in the corresponding Accumulator vector format. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

#### Operation:

StoreACC (acc, fmt, FGR[vs] \* FGR[vt])

#### Exceptions:

#### Subtract Vector Multiply from Accumulator

### MULS.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10 | 9:6 | 5:0 |
|-------|-------|-------|-------|----|-----|-----|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt | vs | L<br>0 | 0 | MULS<br>1 1 0 0 1 0 |
| 6 | 5 | 5 | 5 | 1 | 4 | 6 |

| Format:  | MULS.QH vs, vt<br>MULS.OB vs, vt | MDMX |
|----------|----------------------------------|------|
| Purpose: | To perform a combined multiply-then-subtract of integer vectors. |      |

Description:  $acc[i] \leftarrow acc[i]-(vs[i]*select(i,sel,vt))$ 

The values in vector *vt* are multiplied by the values in vector *vs*, and the product is subtracted from the Accumulator. Wrapped arithmetic is performed, such that overflows and underflows wrap around the Accumulator's representable range before being written into the Accumulator.

The operands are values in integer vector format *fmt*. The Accumulator is in the corresponding Accumulator vector format. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

Operation:

StoreACC (acc, fmt, Wrap(ValueACC(acc, fmt) - (FGR[vs] \* FGR[vt])))

Exceptions:

#### Load Negative Vector Multiply

### MULSL.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10 | 9:6 | 5:0 |
|-------|-------|-------|-------|----|-----|-----|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt | vs | L<br>1 | 0 | MULS<br>1 1 0 0 1 0 |
| 6 | 5 | 5 | 5 | 1 | 4 | 6 |

| Format:  | MULSL.QH vs, vt<br>MULSL.OB vs, vt | MDMX |
|----------|------------------------------------|------|
| Purpose: | To perform a combined multiply-then-subtract of integer vectors. |      |

Description:  $acc[i] \leftarrow -(vs[i]*select(i,sel,vt))$ 

The values in vector *vt* are multiplied by the values in vector *vs*, and the negated product is written into the Accumulator, replacing its previous contents. Wrapped arithmetic is performed, such that overflows and underflows wrap around the Accumulator's representable range before being written into the Accumulator.

The operands are values in integer vector format *fmt*. The Accumulator is in the corresponding Accumulator vector format. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

#### Operation:

StoreACC (acc, fmt, - (FGR[vs] \* FGR[vt]))
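
The relationship between MULS and MULSL can be sketched in Python as below. This is an illustration only: it assumes the *L* bit behaves as a "clear the Accumulator first" flag (modelled by the hypothetical `load` parameter), that select(i,sel,vt) has already been resolved into `vt_selected`, and that list index i holds element i.

```python
def wrap_signed(value, bits):
    """Wrap an integer into the signed two's-complement range of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def muls_qh(acc, vs, vt_selected, load=False):
    """MULS when load is False, MULSL when load is True (QH, 48-bit accumulator elements)."""
    base = [0] * len(acc) if load else acc  # assumed effect of L = 1
    return [wrap_signed(a - s * t, 48) for a, s, t in zip(base, vs, vt_selected)]


acc = [100, 100, 100, 100]
after_muls = muls_qh(acc, [1, 2, 3, 4], [5, 5, 5, 5])
after_mulsl = muls_qh(acc, [1, 2, 3, 4], [5, 5, 5, 5], load=True)
```

With the same operands, the MULS form subtracts from the existing contents, while the MULSL form discards them and leaves only the negated products.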

Exceptions:

### NOR.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0 |
|-------|-------|-------|-------|------|-----|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt | vs | vd | NOR<br>0 0 1 1 1 1 |
| 6 | 5 | 5 | 5 | 5 | 6 |

| Format:  | NOR.QH vd, vs, vt<br>NOR.OB vd, vs, vt | MDMX |
|----------|----------------------------------------|------|
| Purpose: | To do a bitwise logical NOR.           |      |

Description:  $vd[i] \leftarrow vs[i] \text{ NOR select}(i,sel,vt)$ 

Each element of vector *vs* is combined with the corresponding element of vector *vt* in a bitwise logical NOR operation.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

Operation:

StoreFPR(fd, fmt, ValueFPR(fs,fmt) nor ValueFPR(ft,fmt))
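
A minimal Python sketch of the element-wise NOR for OB data follows. It assumes select(i,sel,vt) has already been resolved into `vt_selected` and treats each element as an 8-bit pattern; the function name `nor_ob` is hypothetical.

```python
def nor_ob(vs, vt_selected):
    """vd[i] <- NOT (vs[i] OR vt_selected[i]), kept to 8 bits per OB element."""
    return [~(s | t) & 0xFF for s, t in zip(vs, vt_selected)]


vd = nor_ob(
    [0x00, 0xFF, 0x0F, 0xAA, 0x01, 0x80, 0x3C, 0x10],
    [0x00, 0x00, 0xF0, 0x55, 0x02, 0x01, 0xC3, 0x10],
)
```

The mask with `0xFF` is needed only because Python integers are unbounded; in hardware the result is naturally confined to the element width.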

Exceptions:

#### Vector Or

### OR.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0 |
|-------|-------|-------|-------|------|-----|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt | vs | vd | OR<br>0 0 1 1 1 0 |
| 6 | 5 | 5 | 5 | 5 | 6 |

| Format:  | OR.QH vd, vs, vt<br>OR.OB vd, vs, vt | MDMX |
|----------|--------------------------------------|------|
| Purpose: | To do a bitwise logical OR.          |      |

Description:  $vd[i] \leftarrow vs[i] OR select(i,sel,vt)$ 

Each element of vector *vs* is combined with the corresponding element of vector *vt* in a bitwise logical OR operation.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

Operation:

StoreFPR(fd, fmt, ValueFPR(fs,fmt) or ValueFPR(ft,fmt))

Exceptions:

### PICKF.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0 |
|-------|-------|-------|-------|------|-----|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt | vs | vd | PICKF<br>0 0 0 0 1 0 |
| 6 | 5 | 5 | 5 | 5 | 6 |

| Format:  | PICKF.QH vd, vs, vt<br>PICKF.OB vd, vs, vt | MDMX |
|----------|--------------------------------------------|------|
| Purpose: | To select elements of a vector.            |      |

Description:  $vd[i] \leftarrow cc[i] = 0 ? vs[i] : select(i,sel,vt)$

Depending on the *cc* bits, the vector *vd* is written with either the corresponding element of vector *vs* or the corresponding element of vector *vt*. When operating on OB format data, all 8 *cc* bits are used. When operating on QH format data, *cc* bits 0 through 3 are used.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

Both PICKF and PICKT are necessary since the operands are not symmetrical — every element of vector *vs* is used, whereas *sel* selects the values of *vt*[] used for each i.

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.
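
The selection rule, and its mirror image in PICKT, can be sketched in Python as follows. This is an illustration under stated assumptions: the condition codes are passed as an integer `cc` whose bit i controls element i, select(i,sel,vt) has already been resolved into `vt_selected`, and the function names are hypothetical.

```python
def pickf_qh(cc, vs, vt_selected):
    """vd[i] <- vs[i] when cc bit i is 0, otherwise vt_selected[i] (QH uses cc bits 0..3)."""
    return [s if (cc >> i) & 1 == 0 else t
            for i, (s, t) in enumerate(zip(vs, vt_selected))]


def pickt_qh(cc, vs, vt_selected):
    """vd[i] <- vs[i] when cc bit i is 1, otherwise vt_selected[i]."""
    return [s if (cc >> i) & 1 == 1 else t
            for i, (s, t) in enumerate(zip(vs, vt_selected))]


cc = 0b0101  # bits 0 and 2 set
picked_f = pickf_qh(cc, [1, 2, 3, 4], [10, 20, 30, 40])
picked_t = pickt_qh(cc, [1, 2, 3, 4], [10, 20, 30, 40])
```

Because only the *vt* side passes through select, swapping the operands of one instruction does not in general reproduce the other, which is why both forms exist.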

Operation:

Exceptions:

#### Select Vector Elements

### PICKT.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0 |
|-------|-------|-------|-------|------|-----|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt | vs | vd | PICKT<br>0 0 0 0 1 1 |
| 6 | 5 | 5 | 5 | 5 | 6 |

| Format:  | PICKT.QH vd, vs, vt<br>PICKT.OB vd, vs, vt | MDMX |
|----------|--------------------------------------------|------|
| Purpose: | To select elements of a vector.            |      |

Description:  $vd[i] \leftarrow cc[i] = 1 ? vs[i] : select(i,sel,vt)$

Depending on the *cc* bit, the vector *vd* is written with either the corresponding element of vector *vs* or the corresponding element of vector *vt*. When operating on OB format data, all 8 *cc* bits are used. When operating on QH format data, *cc* bits 0 through 3 are used.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

Both PICKF and PICKT are necessary since the operands are not symmetrical — every element of vector *vs* is used, whereas *sel* selects the values of *vt*[] used for each i.

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

#### Operation:

#### Exceptions:

#### Scale, Round and Clamp Accumulator

### Rx.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0 |
|-------|-------|-------|-------|------|-----|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt | 0 | vd | Rx<br>1 0 0 x x x |
| 6 | 5 | 5 | 5 | 5 | 6 |

| Format:  | Rx.QH vd, vt<br>Rx.OB vd, vt | MDMX |
|----------|------------------------------|------|
| Purpose: | To scale, round and then clamp an accumulator's values into a vector register. |      |

Description:  $vd[i] \leftarrow Clamp(Round(acc[i] >> select(i,sel,vt)))$ 

The values in the Accumulator are shifted right by the values in vector *vt*, rounded by the indicated mode, and clamped to either a signed or unsigned subset of the range of *vd*[]. This is the only instruction type that can do an unsigned quad-half clamp.

The *vt* operands are values in integer vector format *fmt*. The Accumulator is in the corresponding Accumulator vector format. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

In the QH format, if an element of vt[] is negative, the corresponding element of vd[] is undefined. If an element of vt[] is greater than 48, all significant bits will be shifted away and the result will be zero. In the OB format, if an element of vt[] is greater than 24, then the result will be zero.

The rounding modes available depend on the format selected, and in the QH format are available in signed and unsigned versions, as shown below:

| Rounding direction | Quad Half format, signed | Quad Half format, unsigned | Oct Byte format, unsigned |
|--------------------|--------------------------|----------------------------|---------------------------|
| all fractional values round toward zero | RZS.QH | RZU.QH | RZU.OB |
| to nearest, exactly halfway rounds away from zero | RNAS.QH | RNAU.QH | RNAU.OB |
| to nearest, exactly halfway rounds to even | RNES.QH | RNEU.QH | RNEU.OB |

Rounding Modes Used in Rx.fmt

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.
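
The scale, round and clamp sequence can be illustrated with the Python sketch below. It is a model under stated assumptions, not a definitive implementation: shift amounts are taken as already selected and non-negative, the mode strings `"Z"`, `"NA"` and `"NE"` are hypothetical labels for the three rounding directions in the table, and only the signed QH range and the unsigned OB range are shown because those bounds follow directly from the element formats.

```python
QH_SIGNED = (-(1 << 15), (1 << 15) - 1)  # signed 16-bit element range
OB_UNSIGNED = (0, (1 << 8) - 1)          # unsigned 8-bit element range


def scale_round(value, shift, mode):
    """Return value / 2**shift rounded per mode: 'Z', 'NA' or 'NE'."""
    if mode not in ("Z", "NA", "NE"):
        raise ValueError(mode)
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    quotient = magnitude >> shift
    remainder = magnitude - (quotient << shift)
    if shift > 0 and mode != "Z":
        half = 1 << (shift - 1)
        # On an exact tie, NA always rounds the magnitude up; NE does so only
        # when that makes the result even.
        tie_goes_up = mode == "NA" or (quotient & 1) == 1
        if remainder > half or (remainder == half and tie_goes_up):
            quotient += 1
    return sign * quotient


def rx(acc, shifts, mode, bounds):
    """vd[i] <- Clamp(Round(acc[i] >> shifts[i])) for the given clamp bounds."""
    lo, hi = bounds
    return [max(lo, min(hi, scale_round(a, s, mode))) for a, s in zip(acc, shifts)]


acc = [5, -5, 7, 1 << 20]
shifts = [1, 1, 1, 2]
rz = rx(acc, shifts, "Z", QH_SIGNED)
rna = rx(acc, shifts, "NA", QH_SIGNED)
rne = rx(acc, shifts, "NE", QH_SIGNED)
```

The three results differ only on the exact halfway cases (2.5 and 3.5 in magnitude); the last element exceeds the signed 16-bit range in every mode and is clamped.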

Operation:

Exceptions:

### RAC.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0 |
|-------|-------|-------|-------|------|-----|
| COP2<br>0 1 0 0 1 0 | fmt/op | 0 | 0 | vd | RAC<br>1 1 1 1 1 1 |
| 6 | 5 | 5 | 5 | 5 | 6 |

| Format:  | RACL.QH vd<br>RACM.QH vd<br>RACH.QH vd<br>RACL.OB vd<br>RACM.OB vd<br>RACH.OB vd | MDMX |
|----------|------------------------------------------------------------------------------|------|
| Purpose: | To read sections of the accumulator into a vector register. |      |

Description:  $vd[i] \leftarrow acc[i].\{low, med, high\}$

Read either the least significant, middle significant, or most significant third of the bits of the Accumulator elements. No clamping of the values extracted is performed; the bits are simply copied into elements of vd[].

The field *fmt/op* specifies which 8 or 16 bits of each Accumulator element are read, as follows:

| operation | fmt/op, OB Fmt | fmt/op, QH Fmt |
|-----------|----------------|----------------|
| RACL      | 0000 0         | 000 01         |
| RACM      | 0100 0         | 010 01         |
| RACH      | 1000 0         | 100 01         |

RAC fmt/op Encodings

This operation is a signal processing operation, no data-dependent exceptions are possible.

RACL, RACM, and RACH, followed by WACL and WACH, are used to save and restore the Accumulator. This save/restore function is format independent: either format can be used to save or restore Accumulator values generated by either QH or OB operations. There is no implied data conversion; the mapping between element bits of the OB format Accumulator and bits of the same Accumulator interpreted in QH format is implementation specific, but consistent for each implementation.
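
The extraction of accumulator thirds can be sketched in Python as below. The sketch assumes each Accumulator element is held as a signed Python integer (48 bits for QH, 24 bits for OB) and returns the raw bit patterns copied into vd[]; the names `rac`, `ELEMENT_BITS` and `PART_INDEX` are hypothetical.

```python
ELEMENT_BITS = {"QH": 16, "OB": 8}      # width of one vd[] element
PART_INDEX = {"L": 0, "M": 1, "H": 2}   # low, middle, high third


def rac(acc, part, fmt):
    """Copy one third of each accumulator element, unclamped, as a raw bit pattern."""
    width = ELEMENT_BITS[fmt]
    shift = PART_INDEX[part] * width
    mask = (1 << width) - 1
    return [(a >> shift) & mask for a in acc]


acc_qh = [0x123456789ABC, -1, 0, 1]
low = rac(acc_qh, "L", "QH")
mid = rac(acc_qh, "M", "QH")
high = rac(acc_qh, "H", "QH")
```

Reading all three thirds captures every bit of each element, which is what makes the save half of a save/restore sequence possible without clamping.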

#### Operation:

Exceptions:

Coprocessor Unusable

#### Vector Element Shuffle

### SHFL.op.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0 |
|-------|-------|-------|-------|------|-----|
| COP2<br>0 1 0 0 1 0 | fmt/op | vt | vs | vd | SHFL<br>0 1 1 1 1 1 |
| 6 | 5 | 5 | 5 | 5 | 6 |

| Format:  | SHFL.op.QH vd, vs, vt<br>SHFL.op.OB vd, vs, vt | MDMX |
|----------|------------------------------------------------|------|
| Purpose: | To make a new vector of the elements of two other vectors. |      |

Description:  $vd[i] \leftarrow \text{one of } vs[j] \text{ or } vt[j]$

Elements of vectors vs and vt are merged into a new vector. Not all possible rearrangements are available; the variants of this instruction are tailored to the data movement patterns of specific calculations. The shuffles available in OB and QH formats are given in the tables below.

Note that UPSL.OB and UPSH.OB are the only MU instructions that interpret an element of an OB format vector as a signed quantity.

The operands are values in integer vector format *fmt*. *sel* selects the values of vt[] used for each i. See section C 4 on page C-2 for a description of *fmt* encoding. The remaining bits in the field are not used for a vt[] select but rather are used to encode the shuffle operation.

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

| fmt/op | Operation | vd[7]         | vd[6] | vd[5]         | vd[4] | vd[3]         | vd[2] | vd[1]         | vd[0] |
|--------|-----------|---------------|-------|---------------|-------|---------------|-------|---------------|-------|
| 0000 0 | UPUH      | 0             | vs[7] | 0             | vs[6] | 0             | vs[5] | 0             | vs[4] |
| 0001 0 | UPUL      | 0             | vs[3] | 0             | vs[2] | 0             | vs[1] | 0             | vs[0] |
| 0010 0 | UPSH      | sign<br>vs[7] | vs[7] | sign<br>vs[6] | vs[6] | sign<br>vs[5] | vs[5] | sign<br>vs[4] | vs[4] |
| 0011 0 | UPSL      | sign<br>vs[3] | vs[3] | sign<br>vs[2] | vs[2] | sign<br>vs[1] | vs[1] | sign<br>vs[0] | vs[0] |
| 0100 0 | PACH      | vs[7]         | vs[5] | vs[3]         | vs[1] | vt[7]         | vt[5] | vt[3]         | vt[1] |
| 0101 0 | PACL      | vs[6]         | vs[4] | vs[2]         | vs[0] | vt[6]         | vt[4] | vt[2]         | vt[0] |
| 0110 0 | MIXH      | vs[7]         | vt[7] | vs[6]         | vt[6] | vs[5]         | vt[5] | vs[4]         | vt[4] |
| 0111 0 | MIXL      | vs[3]         | vt[3] | vs[2]         | vt[2] | vs[1]         | vt[1] | vs[0]         | vt[0] |

Oct Byte Shuffles

| fmt/op | Operation | vd[3] | vd[2] | vd[1] | vd[0] |
|--------|-----------|-------|-------|-------|-------|
| 000 01 | MIXH      | vs[3] | vt[3] | vs[2] | vt[2] |
| 001 01 | MIXL      | vs[1] | vt[1] | vs[0] | vt[0] |
| 010 01 | PACH      | vs[3] | vs[1] | vt[3] | vt[1] |
| 011 01 | PACL      | vs[2] | vs[0] | vt[2] | vt[0] |
| 100 01 | BFLA      | vs[2] | vt[3] | vs[0] | vt[1] |
| 101 01 | BFLB      | vs[0] | vt[1] | vs[2] | vt[3] |
| 110 01 | REPA      | vs[3] | vs[2] | vt[3] | vt[2] |
| 111 01 | REPB      | vs[1] | vs[0] | vt[1] | vt[0] |

Quad Half Shuffles
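
Four of the Quad Half shuffles can be modelled with a small table-driven Python sketch. It is an illustration only: the dictionary transcribes the MIXH, MIXL, PACH and PACL rows of the Quad Half table, list index i holds element i, and the names `QH_SHUFFLES` and `shfl_qh` are hypothetical.

```python
# Each entry lists the sources of vd[3], vd[2], vd[1], vd[0], in table order.
QH_SHUFFLES = {
    "MIXH": [("vs", 3), ("vt", 3), ("vs", 2), ("vt", 2)],
    "MIXL": [("vs", 1), ("vt", 1), ("vs", 0), ("vt", 0)],
    "PACH": [("vs", 3), ("vs", 1), ("vt", 3), ("vt", 1)],
    "PACL": [("vs", 2), ("vs", 0), ("vt", 2), ("vt", 0)],
}


def shfl_qh(op, vs, vt):
    """Build vd from vs and vt according to the selected Quad Half shuffle."""
    sources = {"vs": vs, "vt": vt}
    high_to_low = [sources[name][j] for name, j in QH_SHUFFLES[op]]
    return high_to_low[::-1]  # reverse so that index 0 is element 0


vs = [10, 11, 12, 13]  # vs[0]..vs[3]
vt = [20, 21, 22, 23]  # vt[0]..vt[3]
mixl = shfl_qh("MIXL", vs, vt)
pacl = shfl_qh("PACL", vs, vt)
```

MIXL interleaves the low halves of the two vectors, while PACL gathers the even-numbered elements of each.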

Operation:

Exceptions:

#### Vector Shift Left Logical

### SLL.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0 |
|-------|-------|-------|-------|------|-----|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt | vs | vd | SLL<br>0 1 0 0 0 0 |
| 6 | 5 | 5 | 5 | 5 | 6 |

| Format:  | SLL.QH vd, vs, vt<br>SLL.OB vd, vs, vt | MDMX |
|----------|----------------------------------------|------|
| Purpose: | To shift a vector's elements by a variable number of bits. |      |

Description:  $vd[i] \leftarrow vs[i] << select(i,sel,vt)$

Each element of vector *vs* is shifted left by an amount specified by the corresponding element of vector *vt*, and zeros are shifted into the low-order bits. The results are written into vector *vd*. In QH format, all but the lower 4 bits of the shift amount are masked to zero; the largest shift possible is 15 places. In OB format, all but the lower 3 bits of the shift amount are masked to zero; the largest shift possible is 7 places.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.
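
The masking of the shift amount can be illustrated with the Python sketch below. It assumes elements are passed as raw bit patterns (16 bits for QH, 8 bits for OB), that select(i,sel,vt) has already been resolved into `vt_selected`, and that the function name `sll` is hypothetical.

```python
ELEMENT_BITS = {"QH": 16, "OB": 8}


def sll(vs, vt_selected, fmt):
    """vd[i] <- vs[i] << (shift amount masked to the element width), as a raw bit pattern."""
    bits = ELEMENT_BITS[fmt]
    amount_mask = bits - 1          # keeps the lower 4 bits (QH) or 3 bits (OB)
    value_mask = (1 << bits) - 1    # discards bits shifted out of the element
    return [(s << (t & amount_mask)) & value_mask for s, t in zip(vs, vt_selected)]


qh = sll([0x0001, 0x8001, 0x00FF, 0x1234], [4, 1, 16, 20], "QH")
ob = sll([0x01, 0x81], [7, 9], "OB")
```

A requested shift of 16 in QH format therefore behaves as a shift of 0, and a shift of 9 in OB format behaves as a shift of 1.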

#### Operation:

Exceptions:

### SRA.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0 |
|-------|-------|-------|-------|------|-----|
| COP2<br>0 1 0 0 1 0 | fmt/sel<br>x x x 0 1 | vt | vs | vd | SRA<br>0 1 0 0 1 1 |
| 6 | 5 | 5 | 5 | 5 | 6 |

| Format:  | SRA.QH vd, vs, vt                   | MDMX |
|----------|-------------------------------------|------|
| Purpose: | To arithmetic right shift a vector. |      |

Description:  $vd[i] \leftarrow vs[i] >> select(i,sel,vt)$ 

Each element of vector *vs* is shifted right by an amount specified by the corresponding element of vector *vt*. The high-order bits are filled with copies of the original sign bit. The results are written into vector *vd*. All but the lower 4 bits of the shift amount are masked to zero; the largest shift possible is 15 places. This operation is undefined for the OB format, since values in that format are unsigned.

The operands and results are values in integer vector format QH. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the QH format. If not, the results are undefined and the values of the operand vectors become undefined.
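
A short Python sketch of the arithmetic right shift follows. It assumes QH elements are passed as signed Python integers and that select(i,sel,vt) has already been resolved into `vt_selected`; the name `sra_qh` is hypothetical. Python's `>>` on negative integers already replicates the sign bit, so no explicit sign handling is needed.

```python
def sra_qh(vs, vt_selected):
    """vd[i] <- vs[i] >> (lower 4 bits of the shift amount), filling with the sign bit."""
    return [s >> (t & 0xF) for s, t in zip(vs, vt_selected)]


shifted = sra_qh([-32768, -1, 1000, -7], [15, 3, 3, 17])
```

The last element uses a requested shift of 17, which is masked to 1, and the negative value rounds toward negative infinity as sign-filling shifts do.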

Operation:

Exceptions:

#### Vector Shift Right Logical

### SRL.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0 |
|-------|-------|-------|-------|------|-----|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt | vs | vd | SRL<br>0 1 0 0 1 0 |
| 6 | 5 | 5 | 5 | 5 | 6 |

| Format:  | SRL.QH vd, vs, vt<br>SRL.OB vd, vs, vt | MDMX |
|----------|----------------------------------------|------|
| Purpose: | To shift a vector's elements by a variable number of bits. |      |

Description:  $vd[i] \leftarrow vs[i] >> select(i,sel,vt)$ 

Each element of vector *vs* is shifted right by an amount specified by the corresponding element of vector *vt*, and zeros are shifted into the high-order bits. The results are written into vector *vd*. In QH format, all but the lower 4 bits of the shift amount are masked to zero; the largest shift possible is 15 places. In OB format, all but the lower 3 bits of the shift amount are masked to zero; the largest shift possible is 7 places.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

#### Operation:

Exceptions:

### SUB.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10:6 | 5:0 |
|-------|-------|-------|-------|------|-----|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt | vs | vd | SUB<br>0 0 1 0 1 0 |
| 6 | 5 | 5 | 5 | 5 | 6 |

| Format:  | SUB.QH vd, vs, vt<br>SUB.OB vd, vs, vt | MDMX |
|----------|----------------------------------------|------|
| Purpose: | To subtract integer vectors.           |      |

Description:  $vd[i] \leftarrow vs[i]-select(i,sel,vt)$

The differences between the values in vector *vs* and the values in vector *vt* are written into vector *vd*. Saturated arithmetic is performed, such that overflows and underflows clamp to the largest or smallest representable value before writing to vector *vd*.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation, no data-dependent exceptions are possible.

The operands must be a value in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

Operation:

StoreFPR (vd, fmt, Clamp(FGR[vs] - FGR[vt]))
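
The saturating behaviour can be sketched in Python as below. The sketch assumes QH elements are signed 16-bit values and OB elements are unsigned 8-bit values, that select(i,sel,vt) has already been resolved into `vt_selected`, and that the names `RANGES` and `sub_sat` are hypothetical.

```python
RANGES = {"QH": (-32768, 32767), "OB": (0, 255)}


def sub_sat(vs, vt_selected, fmt):
    """vd[i] <- Clamp(vs[i] - vt_selected[i]) to the representable range of fmt."""
    lo, hi = RANGES[fmt]
    return [max(lo, min(hi, s - t)) for s, t in zip(vs, vt_selected)]


qh = sub_sat([100, -30000, 30000, 0], [50, 10000, -10000, 1], "QH")
ob = sub_sat([10, 200, 0, 255, 5, 5, 128, 1], [20, 100, 1, 0, 5, 6, 127, 255], "OB")
```

In OB format every negative difference clamps to zero, since the format has no negative values.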

Exceptions:

#### Accumulate Vector Difference

### SUBA.fmt

| 31:26 | 25:21 | 20:16 | 15:11 | 10 | 9:6 | 5:0 |
|-------|-------|-------|-------|----|-----|-----|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt | vs | L<br>0 | 0 | SUBA<br>1 1 0 1 1 0 |
| 6 | 5 | 5 | 5 | 1 | 4 | 6 |

| Format:  | SUBA.QH vs, vt<br>SUBA.OB vs, vt | MDMX |
|----------|----------------------------------|------|
| Purpose: | To subtract integer vectors and accumulate the difference. |      |

Description:  $acc[i] \leftarrow acc[i]+vs[i]-select(i,sel,vt)$ 

The differences of vector *vt* and vector *vs* are added to those in the Accumulator. Wrapped arithmetic is performed, such that overflows and underflows wrap around the Accumulator's representable range before being written into the Accumulator.

The operands are values in integer vector format *fmt*. The Accumulator is in the corresponding Accumulator vector format. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

If *L* is 1 then the Accumulator is cleared to zero before the operation.

This operation is a signal processing operation; no data-dependent exceptions are possible.

The operands must be values in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

#### Operation:

StoreACC (acc, fmt, Wrap(ValueACC(acc, fmt) + FGR[vs] - FGR[vt]))
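A minimal C sketch of the wrapped accumulate for the QH format follows. It assumes each signed 48-bit Accumulator element is held in an `int64_t`, that *vt* has already been expanded according to *sel*, and that the *L* bit is passed as a plain flag; `wrap48` and `suba_qh` are hypothetical names, not part of the specification.

```c
#include <stdint.h>

/* Wrap a value into the signed 48-bit range (two's complement),
 * returned in an int64_t container. */
static int64_t wrap48(int64_t x)
{
    const uint64_t mask = (UINT64_C(1) << 48) - 1;
    int64_t s = (int64_t)((uint64_t)x & mask);   /* 0 .. 2^48 - 1 */
    if (s >= (INT64_C(1) << 47)) {
        s -= (INT64_C(1) << 48);                 /* restore the sign */
    }
    return s;
}

/* Sketch of SUBA.QH (L = 0) and SUBL.QH (L = 1):
 * acc[i] = Wrap((L ? 0 : acc[i]) + vs[i] - vt[i]).
 * vt is assumed to be already expanded according to sel. */
static void suba_qh(int64_t acc[4], const int16_t vs[4],
                    const int16_t vt[4], int L)
{
    for (int i = 0; i < 4; i++) {
        int64_t base = L ? 0 : acc[i];
        acc[i] = wrap48(base + (int64_t)vs[i] - (int64_t)vt[i]);
    }
}
```

In this model an element holding the largest 48-bit value wraps to the most negative value when one more is added, instead of saturating.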

#### Exceptions:

## SUBL.fmt

| 31..26              | 25..21  | 20..16 | 15..11 | 10     | 9..6 | 5..0                |
|---------------------|---------|--------|--------|--------|------|---------------------|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt     | vs     | L<br>1 | 0    | SUBA<br>1 1 0 1 1 0 |
| 6                   | 5       | 5      | 5      | 1      | 4    | 6                   |

Format: SUBL.QH vs, vt<br>SUBL.OB vs, vt

MDMX

Purpose: To subtract integer vectors.

Description:  $acc[i] \leftarrow vs[i]-select(i,sel,vt)$ 

Each element of vector *vt* (as selected by *sel*) is subtracted from the corresponding element of vector *vs*, and the differences are added to the values in the Accumulator. Wrapped arithmetic is performed: overflows and underflows wrap around the Accumulator's representable range before the result is written into the Accumulator.

The operands are values in integer vector format *fmt*. The Accumulator is in the corresponding Accumulator vector format. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

If *L* is 1 then the Accumulator is cleared to zero before the operation.

This operation is a signal processing operation; no data-dependent exceptions are possible.

The operands must be values in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

Operation:

SUBL.fmt

```
StoreACC (acc, fmt, Wrap(FGR[vs] - FGR[vt]))
```

Exceptions:

Write Accumulator High

## WACH.fmt

| 31..26              | 25..21 | 20..16 | 15..11 | 10..6 | 5..0               |
|---------------------|--------|--------|--------|-------|--------------------|
| COP2<br>0 1 0 0 1 0 | fmt/op | 0      | vs     | 0     | WAC<br>1 1 1 1 1 0 |
| 6                   | 5      | 5      | 5      | 5     | 6                  |

Format: WACH.QH vs<br>WACH.OB vs

MDMX

Purpose: To write sections of the Accumulator from a vector register.

Description:  $acc[i].high \leftarrow vs[i]$ 

Write the most significant third of the bits of the Accumulator elements. The least significant two thirds of the bits of the Accumulator elements are unaffected.
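For the QH format, the most significant third of a 48-bit Accumulator element is its upper 16 bits. The following C sketch illustrates the write under some assumptions: each element is modelled as a raw 48-bit pattern in the low bits of a `uint64_t`, the elements of *vs* are passed as raw 16-bit patterns, and `wach_qh` is a hypothetical name.

```c
#include <stdint.h>

/* Sketch of WACH.QH: replace bits 47..32 of each 48-bit element with
 * vs[i]; bits 31..0 (the lower two thirds) are left unchanged.
 * The uint64_t container per element is an assumption of this model. */
static void wach_qh(uint64_t acc[4], const uint16_t vs[4])
{
    for (int i = 0; i < 4; i++) {
        acc[i] = ((uint64_t)vs[i] << 32) | (acc[i] & UINT64_C(0xFFFFFFFF));
    }
}
```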

The field *fmt/op* specifies which 8 or 16 bits of each Accumulator element to write, as shown below.

WACH.fmt Instruction fmt/op Field

| operation | OB Fmt | QH Fmt |
|-----------|--------|--------|
| WACH      | 1000 0 | 100 01 |

This operation is a signal processing operation; no data-dependent exceptions are possible.

The RACL/RACM/RACH instructions, followed by WACL/WACH, are used to save and restore the Accumulator. This save/restore function is format independent: either format can be used to save or restore Accumulator values generated by either QH or OB operations. There is no implied data conversion; the mapping between element bits of the OB format Accumulator and bits of the same Accumulator interpreted in QH format is implementation specific, but consistent within each implementation.

This instruction is the only instruction that writes a portion of the Accumulator.

Operation:

**Exceptions:** 

Write Accumulator Low

## WACL.fmt

| 31..26              | 25..21 | 20..16 | 15..11 | 10..6 | 5..0               |
|---------------------|--------|--------|--------|-------|--------------------|
| COP2<br>0 1 0 0 1 0 | fmt/op | vt     | vs     | 0     | WAC<br>1 1 1 1 1 0 |
| 6                   | 5      | 5      | 5      | 5     | 6                  |

Format: WACL.QH vs, vt<br>WACL.OB vs, vt

MDMX

Purpose: To load the Accumulator from a vector register.

Description:  $acc[i] \leftarrow \{sign(vs[i]) \times 16, vs[i], vt[i]\}$ for QH

$acc[i] \leftarrow \{sign(vs[i]) \times 8, vs[i], vt[i]\}$ for OB

Write the least significant two thirds of the bits of the Accumulator elements. The upper one third of the bits of each Accumulator element is written with the sign bit of the corresponding element of vector *vs*[], replicated 16 or 8 times, depending on the format.
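The concatenation in the Description can be sketched per element in C. This is an illustration only: each Accumulator element is modelled as a raw bit pattern in the low 48 (QH) or 24 (OB) bits of an unsigned integer, the operands are passed as raw element patterns, and `wacl_qh_elem` and `wacl_ob_elem` are hypothetical names.

```c
#include <stdint.h>

/* Sketch of one WACL.QH element: {sign(vs) x 16, vs, vt} as a raw
 * 48-bit pattern in the low bits of a uint64_t. */
static uint64_t wacl_qh_elem(uint16_t vs, uint16_t vt)
{
    /* Replicate the sign bit of vs into the upper 16 bits. */
    uint64_t high = (vs & 0x8000u) ? UINT64_C(0xFFFF) : 0;
    return (high << 32) | ((uint64_t)vs << 16) | (uint64_t)vt;
}

/* Sketch of one WACL.OB element: {sign(vs) x 8, vs, vt} as a raw
 * 24-bit pattern in the low bits of a uint32_t. */
static uint32_t wacl_ob_elem(uint8_t vs, uint8_t vt)
{
    /* Replicate the top bit of vs into the upper 8 bits. */
    uint32_t high = (vs & 0x80u) ? UINT32_C(0xFF) : 0;
    return (high << 16) | ((uint32_t)vs << 8) | (uint32_t)vt;
}
```

Read as a signed number, each resulting element is simply the sign extension of the 32-bit (QH) or 16-bit (OB) value formed by *vs* and *vt*.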

The field *fmt/op* specifies which 8 or 16 bits of each Accumulator element to write, as shown below.

WACL.fmt Instruction fmt/op Field

| operation | OB Fmt | QH Fmt |
|-----------|--------|--------|
| WACL      | 0000 0 | 000 01 |

This operation is a signal processing operation; no data-dependent exceptions are possible.

The RACL/RACM/RACH instructions, followed by WACL/WACH, are used to save and restore the Accumulator. This save/restore function is format independent: either format can be used to save or restore Accumulator values generated by either QH or OB operations. There is no implied data conversion; the mapping between element bits of the OB format Accumulator and bits of the same Accumulator interpreted in QH format is implementation specific, but consistent within each implementation.

Operation:

WACL.fmt

**Exceptions:** 

Vector Xor

## XOR.fmt

| 31..26              | 25..21  | 20..16 | 15..11 | 10..6 | 5..0               |
|---------------------|---------|--------|--------|-------|--------------------|
| COP2<br>0 1 0 0 1 0 | fmt/sel | vt     | vs     | vd    | XOR<br>0 0 1 1 0 1 |
| 6                   | 5       | 5      | 5      | 5     | 6                  |

Format: XOR.QH vd, vs, vt<br>XOR.OB vd, vs, vt

MDMX

Purpose: To do a bitwise logical XOR.

Description:  $vd[i] \leftarrow vs[i] XOR select(i,sel,vt)$ 

Each element of vector *vs* is combined with the corresponding element of vector *vt* in a bitwise logical XOR operation. The result is placed in vector *vd*.

The operands and results are values in integer vector format *fmt*. *sel* selects the values of *vt*[] used for each i. See section C 4 on page C-2 for a description of *fmt/sel* encoding.

This operation is a signal processing operation; no data-dependent exceptions are possible.

The operands must be values in the specified format. If not, the results are undefined and the values of the operand vectors become undefined.

Operation:

StoreFPR(vd, fmt, ValueFPR(vs, fmt) xor ValueFPR(vt, fmt))
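As an illustration, the element-wise XOR for the OB format can be sketched in C on a 64-bit register image. The sketch assumes that *vt* has already been expanded according to *sel* and that element i occupies bits 8i+7..8i of the register; that numbering and the name `xor_ob` are assumptions made for this example.

```c
#include <stdint.h>

/* Sketch of XOR.OB on packed 64-bit register images:
 * vd[i] = vs[i] XOR vt[i] for each of the 8 byte elements.
 * Element i is assumed to sit in bits 8*i+7 .. 8*i. */
static uint64_t xor_ob(uint64_t vs, uint64_t vt)
{
    uint64_t vd = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t a = (uint8_t)(vs >> (8 * i));
        uint8_t b = (uint8_t)(vt >> (8 * i));
        vd |= (uint64_t)(uint8_t)(a ^ b) << (8 * i);
    }
    return vd;
}
```

Because XOR never carries between bit positions, this per-element loop gives the same result as a single 64-bit XOR of the two register images when *vt* is used without element selection.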

Exceptions: