In computer architecture, a delay slot is an instruction slot that is executed without the effects of a preceding instruction. The most common form is a single arbitrary instruction located immediately after a branch instruction on a RISC or DSP architecture; this instruction will execute even if the preceding branch is taken. Thus, by design, the instructions appear to execute in an illogical or incorrect order. Assemblers typically reorder instructions automatically by default, hiding the awkwardness from assembly developers and compilers.
Branch delay slots
All MIPS I control flow instructions are followed by a branch delay slot. Unless the branch delay slot is filled by an instruction performing useful work, a nop is substituted. MIPS I branch instructions compare the contents of a GPR (rs) against zero or another GPR (rt) as signed integers and branch if the specified condition is true. Take a simple branch-if-not-equal instruction: if the two register operands are not equal, the branch is taken; otherwise execution continues with the instruction after the delay slot. Either way, the delay slot instruction is always executed, so always remember to fill it. A longer but very useful pattern built on such a branch is a loop that zero-fills a memory range.
When a branch instruction is involved, the location of the following delay slot instruction in the pipeline may be called a branch delay slot. Branch delay slots are found mainly in DSP architectures and older RISC architectures. MIPS, PA-RISC, ETRAX CRIS, SuperH, and SPARC are RISC architectures that each have a single branch delay slot; PowerPC, ARM, Alpha, and RISC-V do not have any. DSP architectures that each have a single branch delay slot include the VS DSP, μPD77230 and TMS320C3x. The SHARC DSP and MIPS-X use a double branch delay slot; such a processor will execute a pair of instructions following a branch instruction before the branch takes effect. The TMS320C4x uses a triple branch delay slot.
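The always-executed delay slot is easiest to see by running a small program. The following is a minimal Python sketch, not real MIPS tooling: the `run` function, the tuple encoding of instructions, and the register and address values are all assumptions made for illustration. It models a single branch delay slot and compares a zero-fill loop whose slot holds a nop with one whose slot holds the store.

```python
def run(program, regs, mem):
    """Run a toy MIPS-like program with one branch delay slot.

    Returns the number of instructions executed. The encoding is hypothetical.
    """
    pc = 0
    branch_target = None  # set by a taken branch, applied after the delay slot
    executed = 0
    while pc < len(program):
        op = program[pc]
        executed += 1
        redirect = branch_target  # branch taken by the previous instruction
        branch_target = None
        if op[0] == "addiu":
            _, rd, rs, imm = op
            regs[rd] = (regs[rs] + imm) & 0xFFFFFFFF
        elif op[0] == "sw":
            _, rt, offset, base = op
            mem[regs[base] + offset] = regs[rt]
        elif op[0] == "bne":
            _, rs, rt, target = op
            if regs[rs] != regs[rt]:
                branch_target = target
        # "nop" falls through and does nothing
        pc = redirect if redirect is not None else pc + 1
    return executed


def zero_fill(program):
    """Clear four words starting at the made-up address 0x100."""
    regs = {"zero": 0, "a0": 0x100, "a1": 0x110}
    mem = {addr: 0xDEADBEEF for addr in range(0x100, 0x110, 4)}
    count = run(program, regs, mem)
    return count, mem


# Delay slot wasted on a nop.
WITH_NOP = [
    ("sw", "zero", 0, "a0"),
    ("addiu", "a0", "a0", 4),
    ("bne", "a0", "a1", 0),   # branch back to index 0
    ("nop",),                 # delay slot
]

# Delay slot doing the store; it runs whether or not the branch is taken.
FILLED = [
    ("addiu", "a0", "a0", 4),
    ("bne", "a0", "a1", 0),   # branch back to index 0
    ("sw", "zero", -4, "a0"), # delay slot
]

if __name__ == "__main__":
    for name, program in (("nop in slot", WITH_NOP), ("store in slot", FILLED)):
        count, mem = zero_fill(program)
        print(name, count, all(value == 0 for value in mem.values()))
```

In this model both loops clear the same four words, but the version with the store in the delay slot executes 12 instructions where the nop version executes 16.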
The following example shows delayed branches in assembly language for the SHARC DSP including a pair after the RTS instruction. Registers R0 through R9 are cleared to zero in order by number (the register cleared after R6 is R7, not R9). No instruction executes more than once.
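The control flow described above can be modelled in a few lines of Python. This is only a sketch: the instruction tuples, the `run` function and the program layout are hypothetical, chosen to match the description (a call, a jump and a return, each followed by two delay slots), and the model assumes that a call's return address points past its delay slots.

```python
def run(program, labels, delay_slots=2):
    """Execute a toy program and return the order in which registers were cleared."""
    cleared = []
    return_stack = []
    pc, target, countdown = 0, None, None
    while pc < len(program):
        kind, arg = program[pc]
        if kind == "clear":
            cleared.append(arg)
            if countdown is not None:
                countdown -= 1  # one delay slot has been consumed
        else:
            if kind == "call":
                # Assumption: the return address skips the caller's delay slots.
                return_stack.append(pc + 1 + delay_slots)
            target = return_stack.pop() if kind == "rts" else labels[arg]
            countdown = delay_slots
        if countdown == 0:
            pc, countdown = target, None  # the transfer finally takes effect
        else:
            pc += 1
    return cleared


# Hypothetical layout: memory order differs from execution order.
PROGRAM = [
    ("clear", "R0"),
    ("call", "fn"),
    ("clear", "R1"),  # first delay slot of the call
    ("clear", "R2"),  # second delay slot of the call
    ("clear", "R6"),  # the return lands here
    ("jump", "end"),
    ("clear", "R7"),  # first delay slot of the jump
    ("clear", "R8"),  # second delay slot of the jump
    ("clear", "R3"),  # fn:
    ("rts", None),
    ("clear", "R4"),  # first delay slot of the return
    ("clear", "R5"),  # second delay slot of the return
    ("clear", "R9"),  # end:
]
LABELS = {"fn": 8, "end": 12}

if __name__ == "__main__":
    print(run(PROGRAM, LABELS))
```

With two delay slots the registers are cleared in numerical order, R0 through R9. Passing `delay_slots=0` to the same function shows how the order changes when transfers take effect immediately.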
The goal of a pipelined architecture is to complete an instruction every clock cycle. To maintain this rate, the pipeline must be full of instructions at all times. The branch delay slot is a side effect of pipelined architectures due to the branch hazard, i.e. the fact that the branch is not resolved until the instruction has worked its way through the pipeline. A simple design would insert stalls into the pipeline after a branch instruction until the new branch target address is computed and loaded into the program counter. Each cycle in which a stall is inserted is considered one branch delay slot. A more sophisticated design would execute program instructions that are not dependent on the result of the branch instruction. This optimization can be performed in software at compile time by moving instructions into branch delay slots in the in-memory instruction stream, if the hardware supports this. Another side effect is that special handling is needed when managing breakpoints on instructions, as well as when stepping through a branch delay slot while debugging.
The ideal number of branch delay slots in a particular pipeline implementation is dictated by the number of pipeline stages, the presence of register forwarding, the pipeline stage in which the branch conditions are computed, whether or not a branch target buffer (BTB) is used, and many other factors. Software compatibility requirements dictate that an architecture may not change the number of delay slots from one generation to the next. This inevitably requires that newer hardware implementations contain extra hardware to ensure that the architectural behavior is followed even though it is no longer relevant.
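The cost of unfilled delay slots can be estimated with simple arithmetic. The Python sketch below is a deliberately crude model with a hypothetical function name and made-up numbers: it ignores pipeline fill time and every hazard other than branches, and simply charges one wasted cycle for each delay slot that holds a stall or a nop.

```python
def total_cycles(useful_instructions, branches, delay_slots, filled_per_branch=0):
    """Estimate cycles when each unfilled branch delay slot wastes one cycle."""
    if not 0 <= filled_per_branch <= delay_slots:
        raise ValueError("filled_per_branch must be between 0 and delay_slots")
    wasted = branches * (delay_slots - filled_per_branch)
    return useful_instructions + wasted


if __name__ == "__main__":
    # 100 useful instructions, 20 of them branches, one delay slot each.
    print(total_cycles(100, 20, 1))     # every slot stalled or holding a nop
    print(total_cycles(100, 20, 1, 1))  # every slot filled with useful work
```

Under these assumptions, filling every slot brings the estimate back to one cycle per useful instruction.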
Load delay slot
A load delay slot is an instruction which executes immediately after a load (of a register from memory) but does not see, and need not wait for, the result of the load. Load delay slots are very uncommon because load delays are highly unpredictable on modern hardware. A load may be satisfied from RAM or from a cache, and may be slowed by resource contention. Load delays were seen on very early RISC processor designs. The MIPS I ISA (implemented in the R2000 and R3000 microprocessors) suffers from this problem.
The following example is MIPS I assembly code, showing both a load delay slot and a branch delay slot.
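The effect of a load delay slot can be illustrated with a small Python model. This is a sketch under stated assumptions, not a description of real hardware: the `run` function and instruction tuples are hypothetical, only the load delay is modelled (no branches), and the slot instruction is assumed to see the register's old value, whereas on real MIPS I hardware software is expected to avoid using the loaded register in the slot at all.

```python
def run(program, regs, mem):
    """Run straight-line code where a loaded value arrives one instruction late."""
    pending = None  # (register, value) produced by the previous lw
    for op in program:
        arriving = pending
        pending = None
        if op[0] == "lw":
            _, rt, addr = op
            pending = (rt, mem[addr])
        elif op[0] == "addu":
            _, rd, rs, rt = op
            regs[rd] = (regs[rs] + regs[rt]) & 0xFFFFFFFF
        # "nop" does nothing
        if arriving is not None:
            # The load completes only after the slot instruction has read its operands.
            regs[arriving[0]] = arriving[1]
    if pending is not None:
        regs[pending[0]] = pending[1]  # a load at the very end still completes
    return regs


def demo(program):
    """Run a program against made-up register and memory contents."""
    regs = {"t0": 1, "t1": 100, "t2": 0}
    mem = {0x10: 7}
    return run(program, regs, mem)


# The addu sits in the load delay slot and reads the stale t0.
USES_SLOT = [("lw", "t0", 0x10), ("addu", "t2", "t0", "t1")]

# A nop fills the slot, so the addu sees the loaded value.
WAITS = [("lw", "t0", 0x10), ("nop",), ("addu", "t2", "t0", "t1")]

if __name__ == "__main__":
    print(demo(USES_SLOT)["t2"], demo(WAITS)["t2"])
```

In this model the first program computes its sum from the stale value of t0, while the second, with a nop in the slot, uses the value that was loaded.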
In this lab we are going to use MPLAB® X IDE and its associated XC32 compiler to write and debug a PIC32 assembler program. The MPLAB X software is NetBeans based and will run under Linux, Mac and Windows.
You can download your own copy of MPLAB X from Microchip's MPLAB X download page. You will want a copy of the following:
- MPLAB X IDE – v2.20 is installed in the lab
- MPLAB XC 32 Compiler – v1.33 is installed in the lab
You can get a short list of instructions from the MIPS 'GreenCard'. Also see Section 2 of the PIC32 Family Reference Manual, CPU for Devices with M4K Core. If you want 274 pages about MPLAB X, check out the MPLAB® X IDE User's Guide. The definitive guide to the XC32 assembler is the 234-page MPLAB® XC32 Assembler, Linker and Utilities User's Guide. Information relevant to the assembler is also contained in the MPLAB® XC32 C/C++ User's Guide.
Getting Started
From the command line you can type mplab_ide, or you can type 'mplab' in the search box at the top of the launcher. Be sure to select MPLAB IDE and not MPLAB IPE. (IPE is Integrated Programming Environment. In embedded system design, 'programming' is the process of downloading code to the chip.)
If you've used NetBeans, you'll feel at home with MPLAB X.
First Project in MPLAB X
Creating the project
Use the menu choices File ⇒ New Project... to begin the process of creating a project. Then work your way through a few windows.
- In the Choose Panel window, from the Microchip Embedded category choose a Standalone Project.
- In the Select Device window, select any device from the 32-bit MCUs (PIC32) family. I chose PIC32MX250F256H. Next time you'll be able to speed up this process by choosing the Recently Used family.
- In the Select Tool window, go for the Simulator.
- In the Select Compiler window, select an XC32. There should be only one choice.
- In the Select Project Name and Folder window, think of a clever project name and click Set as main project.
You may have noticed that many of the selected choices were preceded by a little green dot. Avoid the ones with the red and yellow dots.
Let's mention a couple of things before going on.
There were a lot of devices to choose from. If you are using the simulator, as we are today, you don't have to be that precise in your selection, but usually you must choose the device that matches the one you plan to use in your project.
The XC32 is Microchip's latest compiler for its 32-bit processors. We are using the free (unlicensed) version. The free compiler is based on the gcc toolchain and it does not optimize your C code. It will cost you about $1000 to get the optimizing 'PRO' compiler.
Also, notice that your projects are going to be stored in directories with names that end with a capital X, such as CSCI255rocks.X.
Click on the name of your project and then select Properties. Make sure you have chosen well.
Checking out the IDE
At this point you have a NetBeans environment that will be familiar to alumni of CSCI 181 and 202. Move your mouse over the menu choices at the top of the window, from File to Help. Press on the choices to look at their submenus. Pay particular attention to items under Debug. Most of the choices are presently grayed out, because they can't be used until you are working on a project.
Notice that the lower left corner is occupied by a Dashboard display. The Microchip PIC devices have very little memory, so we need an easy-to-use means of figuring out how much memory our programs are using.
Adding an assembler program
Now we'll use the menu to create an assembler program. Start with the menu choices File ⇒ New File...
- In the Choose File Type window, take the Assembler category and the AssemblyFile.s file type. You must make the .s choice.
- In the Name and Location window, choose a name for your file. I suggest something like whatever. MPLAB X will add the .s to your filename.
At this point, you should have an empty program in the upper-right window. Make sure that your program really is under Source Files.
Copy the following program into your empty window.
Go ahead and press the hammer to build it, so we can make sure your installation is working.
This program is the start of an assembly language implementation of this statement, which could be written in Java, C++ or C.
What's it all about
But clearly this isn't Java, C++ or C.
Let's look at this program for a minute. Like most assembly language programs, this one contains several pseudo-ops or directives. These are lines of code that don't create instructions. They may define space for variables, control the assembly process, or even control the spacing for a printout of your code.
The program starts with the directive: #include
which would be a legal statement in either C++ or C. This causes the assembler to include a file defining useful constants for programming PIC microcontrollers. Open up the file /opt/microchip/xc32/v1.33/pic32mx/include/xc.h in MPLAB X. Be sure to use Open File... and not Open Project...!
That one isn't that interesting. It's just a list of include files for several different PIC processors. Try again, but this time open /opt/microchip/xc32/v1.33/pic32mx/include/proc/p32mx250f256h.h in MPLAB X. Using Edit ⇒ Find... or simply Ctrl+F, find the definition of PORTB, the special function register you used in the Arduino lab. The volatile keyword is common to Java and C and denotes a variable that can be changed by external forces.
We are serious. You need to know how to navigate the file system from the IDE. Show us the line defining PORTB.
Now close those two big system files and get back to your program.
The second line in your program is also a directive: .global main
This causes the assembler to announce main as an external symbol of your program. This means that the outside 'world' will know about main. This is a bit like declaring a method public in Java.
Since ancient times, running programs have been considered to consist of four segments: (1) the text segment, which contains compiled code; (2) the stack segment, which contains local variables used by functions or methods; (3) the heap segment, which contains dynamically allocated memory, such as Java objects; and (4) the data segment, which contains global variables. The data area also contains the oddly named bss (block started by symbol) segment, which contains space for uninitialized data. In this program, you see that both a text and a data segment are defined. The data segment reserves room for the five variables a, b, c, x and z. The text segment contains the PIC instructions.
Executing a program
The Microchip simulator does know how to simulate the PIC instructions of your program, but right now your program does little more than loop forever. To see anything interesting you must step through the program.
Getting ready to run
To do this you need to know how to set breakpoints in your program. Move your mouse onto the narrow column of program line numbers just to the left of the first real instruction, the lw, of your program and click. This should put a little red square in the line number column and highlight the entire line in red to show that you have set a breakpoint.
Now use the menu choices Debug ⇒ Debug Project. This should start the simulation but stop at the breakpoint.
Put the breakpoint in your program and run your program to the breakpoint. If you have done this before using NetBeans, help out any lab neighbors who haven't.
You will notice a bunch of new tabs in the lower right panel. One of them is called Variables. Go over to the Variables tab and add a watch for all your variables.
However, all the action is with the registers. Use the menu choices Window ⇒ PIC Memory Views ⇒ CPU Registers to bring up a new tab called CPU Memory. Use the arrow keys to find registers $t0 to $t5 in this display.
By clicking within the value columns, such as Hex, of the CPU Registers tab, you can change the values of registers.
Set the values of registers $t0 to $t5.
Now use Window ⇒ PIC Memory Views ⇒ Execution Memory to see your assembly code translated into machine code in the Opcode column. The lw assembly language statement has been translated into a two-instruction machine language sequence. We will explain that in class.
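One likely reason for the two-instruction sequence, stated here as an assumption rather than something read from this program's listing, is that a 32-bit address does not fit in a single MIPS instruction: assemblers commonly load the upper 16 bits with lui and supply the lower 16 bits as the lw offset. Because lw sign-extends its offset, the upper half has to be rounded up whenever the lower half is 0x8000 or more. The Python sketch below shows that arithmetic; the function names and sample addresses are made up for illustration.

```python
def split_address(addr):
    """Split a 32-bit address into the 16-bit halves used by a lui/lw pair."""
    lo = addr & 0xFFFF
    # lw sign-extends its offset, so compensate in the upper half when lo >= 0x8000.
    hi = ((addr + 0x8000) >> 16) & 0xFFFF
    return hi, lo


def effective_address(hi, lo):
    """Recombine the halves the way lui followed by lw would."""
    signed_lo = lo - 0x10000 if lo & 0x8000 else lo
    return ((hi << 16) + signed_lo) & 0xFFFFFFFF


if __name__ == "__main__":
    for addr in (0xA0000200, 0xA0008004):  # hypothetical variable addresses
        hi, lo = split_address(addr)
        print(hex(addr), hex(hi), hex(lo), hex(effective_address(hi, lo)))
```

Compare the Opcode column in the Execution Memory view with this arithmetic to see whether your build uses the same scheme.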
Completing the program
Right now only three of the ten simple assignment statements have been completed. Complete the remaining seven and build your program.
You will need to use the sw instruction for the last assignment.
Debugging with the simulator
It's time to run some code. Use the F8 key to step through your program while looking at the CPU Registers to see changes in the registers. It will not take long until something goes wrong.
Step through the program one instruction at a time.After each instruction, verify that the expected result has beenstored in the destination register.
Stop at the first instruction where something goes wrong.
More problems with delay
There are some MIPS instructions that do not immediately produce a result. One of these is the jump (j spin) instruction near the end of your program. It does not jump immediately, but always executes one more instruction in a delay slot before transferring to its target. In our program, the delay slot is filled with a nop instruction.
The MPLAB X simulator also believes that the mul instruction requires one delay slot before it can store the result of the multiplication in the destination register. Add some nop instructions to your program to provide for these delay slots.
Again step through the program one instruction at a time.
Run your program until it stores 297, 0x129 in hexadecimal, in z.
Improving your program
You can make your program a little better by using those delay slots for useful work. Figure out how to fill a delay slot by moving a lw into a nop slot. This is an easy way to reduce the number of instructions by two.
You could also perform $t5 = b*x a little earlier and get rid of the last nop.
However, not all instructions require the same amount of time. A multiplication takes many more clock cycles than an addition. If you really want to speed up your program, use the following assignment to reduce the number of multiplications from three to two: z = (a*x+b)*x + c
If time permits, make at least one improvement to your program.
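The rewritten assignment is Horner's rule applied to a quadratic. The short Python check below assumes, based on the three multiplications and the b*x term mentioned above, that the original statement has the form a*x*x + b*x + c; the function names and sample values are made up for illustration and are not the lab's data.

```python
def direct(a, b, c, x):
    """Quadratic evaluated term by term: three multiplications."""
    return a * x * x + b * x + c


def horner(a, b, c, x):
    """The same quadratic in nested form: two multiplications."""
    return (a * x + b) * x + c


if __name__ == "__main__":
    # Hypothetical values, chosen only so the result is easy to check by hand.
    a, b, c, x = 2, 9, 7, 10
    print(direct(a, b, c, x), horner(a, b, c, x))
```

Both forms compute the same value for any integers, so the nested form can replace the original without changing what is stored in z.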
In case of trouble...
If your windows get completely messed up, try some of the following:
- Window ⇒ Reset Windows
- Window ⇒ Output ⇒ Output
- Window ⇒ Debugging ⇒ Variables