

# CSIM's General Blocks Library

Jan 2017



# Outline

- History
- Why General Blocks?
- Advantages
- Disadvantages
- Status
- Library Description
- Example simulations





## History

- The General Blocks library was developed as a replacement for BONeS (Block Oriented Network Simulator)
  - BONeS was developed at the University of Kansas, and was commercially available from ~1988-1999 (Comdisco/Alta/Cadence)
  - BONeS was used by a significant community
  - The General Blocks library was initially developed 1999-2002. Significant enhancements have occurred since, and are ongoing.



# Why Use General Blocks?

- Advantages of the General Blocks library
  - Ability to leverage significant residual BONeS expertise, compare known results
  - The block oriented approach (>330 mostly small, simple blocks) enables fine granularity in architecture definition and tracing
  - Extremely flexible; can easily implement new modules that would be more difficult in other model libraries



### Advantages

- Sophisticated models for server resources
  - Priority, preemption, round robin
- Popups provide message details for selected blocks
- Extremely flexible mechanism for representing messages (data\_structs.txt)
- Minimum need to become involved with C code
- Many built-in statistical models and display mechanisms
- Significant upgrades recently to help accelerate the initial design/debug cycle
- Useful for modeling data-processing systems and other systems (e.g., anything other than signal processing) that are not covered by DFG modeling methods.



## Disadvantages

- Does not inherently separate hardware from software
  - Cannot use DFG (Data Flow Graph) Schedulers.
- Can be computationally inefficient for large models (flip side of *flexible*)
- Oriented toward static topologies
- Not specialized for modelling specific kinds of systems.



#### CSIM: an Open Architecture Tool

- CSIM is based upon a "toolbox" approach
  - "CSIM" is actually the assembly of many independent tools and libraries; it is not a monolithic ("stovepipe") chunk of code
  - The key independent tools/libraries include:
    - CSIM precompiler
    - CSIM kernel library
    - GUI
    - Simview
    - XGraph
    - NumUtils, general\_utils
    - The Model Libraries



#### CSIM: an Open Architecture Tool (cont)

- CSIM is based upon a "toolbox" approach
  - CSIM leverages the existence of available applications, tools, utilities and standards
    - Minimizes CSIM-specific development, maintenance and documentation ("avoid re-inventing the wheel")
    - Examples of applications/tools/libraries leveraged
      - Compilers (cc, gcc, etc.)
      - Debuggers (gdb, ddd, etc.)
      - Text editors (vi, emacs, wordpad, textedit, etc.)
      - Libraries (C language, GTK, OpenGL, Motif, OTK, etc.)
      - Graphical viewers/editors (xv, gimp, etc.)
      - Data standards (xml, xpm, etc.)



# Application

- Typically, the General Blocks Library is used to model and simulate networked computer resources to:
  - Identify points of contention
  - Estimate performance limits or bottlenecks
  - Evaluate processor utilizations
  - Evaluate system latencies
  - etc.
- The types of outputs typically obtained include:
  - Scatter plots, histograms and statistical measures of latency data



### Status

- The General Blocks library currently contains more than 330 models in the following groups:
  - Arithmetic
  - Comparison
  - Conversions
  - Counters
  - Data Type Operations
  - Data Structure Access
  - Delays
  - Execution Control
  - File Access
  - Generators
  - Logical
  - Loops
  - Memory
  - Miscellaneous
  - Plot Generation
  - Quantity Shared Resource
  - Queues And Servers
  - Probes
  - Queues
  - Servers
  - Statistics
  - Switches
  - Timers
  - Traffic Generators



## **Most Recent Additions**

- Additional models have been added to the library
- These new models include:
  - Admin
  - Append\_Route\_List
  - Append\_String
  - Generic\_Batcher
  - Generic\_UnBatcher
  - LockRealTime
  - Num\_to\_String
  - PlotLive
  - PortConvert
  - QSR1
  - QSR2
  - Receiver
  - Router
  - Sender
  - Switch\_5way

#### **General Blocks Library Devices**

- New\_Models: Generic\_Batcher, Generic\_UnBatcher, Receiver, Sender, PlotLive, LockRealTime, QSR1, QSR2, Switch\_5way
- Vectors: VCreate, Setup\_VElem, VLen, Access\_Vector, GVCreate, Setup\_GVElem, GVLen, Access\_GVector
- Traffic\_Generators: Uniform\_PulseTrain, Poisson\_PulseTrain, Enabled\_Uniform\_PulseTrain, Enabled\_Poisson\_PulseTrain, Arbitrary\_PulseTrain
- Timers: Start\_Timer, Set\_Alarm, Service\_Timer, Residual\_Time, Reset\_Timer, Cancel\_Timer, Cancel\_Alarm, Alarm\_Active

- Switches: True\_N\_Times, T\_GT\_Startup, T\_GE\_Param, Switch\_4way, Switch, Real\_Within\_Boundaries, Rand\_Switch\_Param, Rand\_Switch, R\_LT\_C, R\_LE\_C, R\_GT\_C, R\_GE\_C, R\_EQ\_C, MemorySwitch, I\_LT\_C, I\_LE\_C, I\_GT\_C, I\_GE\_C, I\_EQ\_C, Enabled\_Switch, Bypass
- Statistical: WeightedMeanAndVariance, Weighted\_General\_Moments, Throughput, Time\_Average, MeanAndVariance, Histogram, Global\_Statistics, General\_Nth\_Moment, Find\_Bin, Dimensioned\_Time\_Average, Dimensioned\_Ensemble\_Average, Construct\_TimeAverage\_Stats, Construct\_Dimensioned\_Stats, Batch\_Timing, Batch\_Statistics, Batch\_Rmin, Batch\_Rmax, Batch\_Mean, Average
- Server\_Resource: SR\_Server\_Utilization\_Probe, SR\_Server\_Response\_Probe, SR\_Server\_Occupancy\_Probe, SR\_Preempt\_Server\_Utilization\_Probe, SR\_Preempt\_Server\_Response\_Probe, SR\_Preempt\_Server\_Occupancy\_Probe, Set\_Resource, Set\_Preempt\_Resource, Service\_wRoundRobin, Service\_wPriority\_Preemption, Service\_wPriority

- QueuesAndServers: FIFOwServers, MultipleServers, ParallelQueues, PQwServers
- Queues: Simple\_LIFO, Simple\_FIFO, FIFOwPriority, FIFO\_wPeek
- QuantityShared\_Resource: Set\_QResource, FreeBasic, Free, ConsumeResourceUnits, ChangeCapacity, AllocatePriority, AllocateParam, AllocateBasic
- Probes: WriteTnow, ThroughputDelayProbe, ThroughputVsTimeProbe, TextualDescriptionProbe, SystemLatencyProbe, ScatterPlotZ, ScatterPlotQ, ScatterPlot, SelectFieldProbe, RealVsTimeProbe, ProcessTimeLineProbe, InsertStatFields, HistogramProbe (plus F1 and F2 variants), GenericProbe, GenericHyperGraphProbe, EventProbe\_with\_Comm, EventProbe, CreateCDFfile (plus Init, F1 and F2 variants), BatchStatisticsProbe (plus f1 and f2 variants), BatchNthMomentProbe (plus f1 and f2 variants), BatchMeanProbe (plus f1 and f2 variants)
- Plot\_Generation: BuildPlot\_Ytime, BuildPlot\_Yonly, BuildPlot\_Y, BuildPlot\_XY, BuildPlot, BuildHistogram



- Number\_Generators: UserCDF\_RanGen, UniformRangenParam, UniformRangen, U\_0\_to\_1\_RanGen, TStop, TNow, Rconst, PoissonRangenParam, PoissonRangen, N01\_Rangen, NormalRangen, NormalRangenParam, IU\_Parem, IU\_NE\_C, IU\_MinMax\_Param, IU\_MinMax, IU, Iconst, GammaRangenParam, GammaRangen, ExponRanGenParam, ExponRanGen, BinomialRangenParam, BinomialRangen
- Miscellaneous: TimeBetweenTriggers, SystemCall, ServiceSetup, Print\_message, PrintEnvelope, Print\_real, Print\_int, Dijkstra, Central\_Utilities, Ack\_Setup

#### General Blocks Library Devices II


- Memory: WriteMemory, RealLocalMem, ReadMemory, MultipleBuffers, Mem\_increment, Mem\_decrement, LocalMem\_wCopy, LocalMemRef, LocalMem, IntLocalMemory, ActiveReadMemory
- Loops: Real\_Do\_Param, Real\_Do, Int\_Do\_Param, Int\_Do\_1\_N, Int\_Do\_0\_Nminus1, Int\_Do
- Logical: False, True, Nxor, Xor, Nor, Nand, Not, Or, And
- Graphical\_Interface: Slider\_box, PromptInt, PromptFloat, PopUpMessage, Navigate\_View, MPGraph, Hilite\_Box, GenericProbePopup, ColorController, ColorBox, Button\_box
- File\_Access: WriteInfo\_Numeric, WriteFile\_String, WriteFile\_Real, WriteFile\_Field, WriteFile\_AppendField, WriteFile\_Int, ReadFile\_String, ReadFile\_Real, ReadFile\_Line, ReadFile\_Int, OpenFileWrite, OpenFileRead, OpenFileAppend, CloseFile
- Execution\_Control: Wrapup, Terminate, OneWay, OnePulse, Merge, Init, Gate\_Switch, Gate, Execute\_in\_order\_4, Execute\_in\_order\_3, Execute\_in\_order

- Delays: FixedProcDelay, FixedAbsDelay, AbsDelay
- Data\_Structure\_Access: TypeSwitch, SelectField, MakeRealDS, InsertTNow, InsertMultipleFieldParams, InsertMultipleTNow, InsertFieldTNow, InsertFieldParam, InsertField, Declare\_DS, Create\_DS, Coerce\_DS
- Data\_Structure\_Operations: TypeOf, TypeConst, TypeCompatible, Tequals, Split\_wDelay, Split3, Split, Sink, Junction, Join, Copy2, CopyDS\_wDelay, CopyDS
- Counters: UpDownCounterChangeValue, UpDownCounter, SimpleCounter, Int\_Accumulator, GlobalCount, Counter, CircularCounter, Accumulator
- Conversions: Truncate, Round, Int\_to\_Real
- Comparison: StringEqualsParam, Set\_Equals, R\_LessThanOrEqual, R\_LessThan, R\_GreaterThanOrEqual, R\_GreaterThan, R\_Equals, Odd, I\_LessThanOrEqualE, I\_LessThan, I\_GreaterThanOrEqual, I\_GreaterThan, I\_Equals, Even
- Arithmetic: Increment, Decrement, I\_add, I\_subtract, I\_mult, I\_div, I\_divprotect, Imod, I\_mod, Iabs, Imin, Imax, Ichs, Igain, R\_add, R\_subtract, R\_mult, R\_div, R\_divprotect, Rsqrt, Rabs, Rmin, Rmax, Rchs, Rgain, sin\_X, cos\_X, tan\_X, ln\_X, exp\_X, X\_powr\_Iconst, X\_powr\_Y, five\_input\_expression, one\_input\_expression\_R, one\_input\_expression\_I, Rlimiter, Ilimiter, Reciprocal, General\_Expression, Imult



# Library Configuration

- General Blocks based simulations generally utilize several libraries
- All.sim contains the basic elements (devices) of the General Blocks library.
- Library.sim contains information to group the All.sim models into manageable hierarchical groups
- One or more local libraries, containing module level and sometimes device level models, are generally referenced
- The User's simulation model will reference these libraries



# General Blocks (GB) Files



- *Library.sim* is used to organize the models into logical groupings for display and access by the CSIM GUI
- *All.sim* contains the detailed implementation code for all of the models in the distributed GB library
- data\_structs.txt contains the definitions for all compound data structures (message definitions) that will be used in simulation
- All simulations will require an *All.sim* and a *data\_structs.txt*; *Library.sim* is optional (although very useful).
- task\_table.dat is required for simulations which require the Admin model
- CSIM will provide additional object files.



## Excerpts from Library.sim

- %include \$CSIM\_ROOT/model\_libs/general\_blocks/All.sim
- <DEFINE\_LIBRARY> Counters
- <MODEL> UpDownCounterChangeValue </MODEL>
- <MODEL> UpDownCounter </MODEL>
- <MODEL> SimpleCounter </MODEL>
- <MODEL> Int\_Accumulator </MODEL>
- <MODEL> GlobalCount </MODEL>
- <MODEL> Counter </MODEL>
- <MODEL> CircularCounter </MODEL>
- <MODEL> Accumulator </MODEL>
- </DEFINE\_LIBRARY>
- <DEFINE\_LIBRARY> Conversions
- <MODEL> Truncate </MODEL>
- <MODEL> Round </MODEL>
- <MODEL> Int\_to\_Real </MODEL>
- </DEFINE\_LIBRARY>
- <DEFINE\_LIBRARY> Comparison
- <MODEL> StringEqualsParam </MODEL>
- <MODEL> Set\_Equals </MODEL>
- <MODEL> R\_LessThanOrEqual </MODEL>
- <MODEL> R\_LessThan </MODEL>
- <MODEL> R\_GreaterThanOrEqual </MODEL>
- <MODEL> R\_GreaterThan </MODEL>
- <MODEL> R\_Equals </MODEL>
- <MODEL> Odd </MODEL>
- <MODEL> I\_LessThanOrEqualE </MODEL>
- <MODEL> I\_LessThan </MODEL>
- <MODEL> I\_GreaterThanOrEqual </MODEL>
- <MODEL> I\_GreaterThan </MODEL>
- <MODEL> I\_Equals </MODEL>
- <MODEL> Even </MODEL>
- </DEFINE\_LIBRARY>
- <DEFINE\_LIBRARY> Plot\_Generation
- <MODEL> BuildPlot\_Ytime </MODEL>
- <MODEL> BuildPlot\_Yonly </MODEL>
- <MODEL> BuildPlot\_Y </MODEL>
- <MODEL> BuildPlot\_XY </MODEL>
- <MODEL> BuildPlot </MODEL>
- <MODEL> BuildHistogram </MODEL>
- </DEFINE\_LIBRARY>
- ...
- <DEFINE\_LIBRARY> Number\_Generators
- <MODEL> UserCDF\_RanGen </MODEL>
- <MODEL> UniformRangenParam </MODEL>
- <MODEL> UniformRangen </MODEL>
- <MODEL> U\_0\_to\_1\_RanGen </MODEL>
- <MODEL> TStop </MODEL>
- <MODEL> TNow </MODEL>
- <MODEL> Rconst </MODEL>
- <MODEL> PoissonRangenParam </MODEL>
- <MODEL> PoissonRangen </MODEL>
- <MODEL> N01\_Rangen </MODEL>
- <MODEL> NormalRangen </MODEL>
- <MODEL> NormalRangenParam </MODEL>
- <MODEL> IU\_Parem </MODEL>
- <MODEL> IU\_NE\_C </MODEL>
- <MODEL> IU\_MinMax\_Param </MODEL>
- <MODEL> IU\_MinMax </MODEL>
- <MODEL> IU </MODEL>
- <MODEL> Iconst </MODEL>
- <MODEL> GammaRangenParam </MODEL>
- <MODEL> GammaRangen </MODEL>
- <MODEL> ExponRanGenParam </MODEL>
- <MODEL> ExponRanGen </MODEL>
- <MODEL> BinomialRangenParam </MODEL>
- <MODEL> BinomialRangen </MODEL>
- </DEFINE\_LIBRARY>

#### Example Model from All.sim



```
DEFINE_DEVICE_TYPE: R_add
PORT_LIST( in1, in2, out );
DOCUMENTATION:
/* The model adds the values of in1 and in2. */
/* Input Ports                               */
/*   in1  Data Type: REAL                    */
/*   in2  Data Type: REAL                    */
/* Output Ports                              */
/*   out  Data Type: REAL                    */
/* Parameters( none )                        */
END_DOCUMENTATION.
DEFAULT_ICON( $CSIM_MODEL_LIBS/general_blocks/Icons/2_1.ppm );
DEFINE_THREAD: start_up
{
 Envelope *a, *b;
 float x, y;  int len;
 while (1)
 {
  RECEIVE( "in1", &a, &len );   /* Wait for the first operand.  */
  x = consume_real(a);
  RECEIVE( "in2", &b, &len );   /* Wait for the second operand. */
  y = consume_real(b);
  x = x + y;
  a = make_real_envelope(x);    /* Wrap the sum in a new Envelope. */
  SEND( "out", a, 1);
 }
}
END_DEFINE_THREAD.
END_DEFINE_DEVICE_TYPE.
```
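The receive, add, send loop of R\_add can be mimicked outside CSIM. The following Python sketch is not CSIM code: the queue-based "ports" and the `r_add` function are stand-ins chosen purely for illustration, and the loop is bounded so that the sketch terminates.

```python
from queue import Queue


def r_add(in1: Queue, in2: Queue, out: Queue, n_messages: int) -> None:
    """Take one REAL from each input port and send their sum to the output port."""
    # The CSIM thread loops forever; this stand-in handles a fixed number of messages.
    for _ in range(n_messages):
        x = in1.get()    # plays the role of RECEIVE("in1", ...) followed by consume_real
        y = in2.get()    # plays the role of RECEIVE("in2", ...) followed by consume_real
        out.put(x + y)   # plays the role of make_real_envelope followed by SEND("out", ...)


in1, in2, out = Queue(), Queue(), Queue()
for a, b in [(1.0, 2.0), (0.5, 0.25)]:
    in1.put(a)
    in2.put(b)

r_add(in1, in2, out, n_messages=2)
results = [out.get(), out.get()]
```

As in the CSIM model, an output is produced only after a value has arrived on both inputs.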

# General Blocks Messages



- Data structures are used to represent messages.
- In the General Blocks Library, there can be several types of messages
  - "Simple" data structures definitions are built-in
    - Int, real, string
  - Compound data structures are defined by the user in the data\_structs.txt file
    - Assemblies of simple data structures
- Typically, data structures contain several fields:
  - Some may contain information about the message, e.g., message size, message priority, message creation time
  - Others may be used to hold information about the system state, probe data, calculation results, etc.
- Some General Blocks "devices" (i.e. Built-in models) operate with compound data structures
  - Others require specific simple data structures
- User models may require specific compound data structures



### Example data\_structs.txt

```
<DEFINE_DATA_STRUCTURES>

struct Throughput_Delay_DS
{ real Mean_Delay=0
  real Var_Delay=0
  real Mean_Throughput=0
  real Var_Throughput=0
  int Nsamples
}

struct Basic_Statistic
{ real mean
  real variance
  real min
  real max
  int Nsamples=0
}

struct Timing_Packet
{ real Time_Created
  real Intermediate_Time
  real Time_Finished
  int Length
  int Type
}

struct Event_Data
{ real EVENT_START_TIME=0
  int EVENT_SEQUENCE_NUMBER=0
  int EVENT_TYPE_PARAMS_INDEX=0
  real PREV_LINKED_EVENT_START_TIME=0
  int PREV_LINKED_EVENT_SEQ_NUMBER=0
  int SOFT_RESET_COMMAND=0
  real EVENT_LENGTH_X_100_NSEC=1000
}

struct Application_Message_Transaction_DS
{ int Application_Message_Type_Code=0
  int Application_Message_Sequence_Number=0
  int Application_Message_Source=0
  int Application_Message_Destination=0
  int Application_Message_Size_Bytes=10
  int Application_Message_Priority=0
  real Application_Message_Create_Time=0
  real Application_Message_Complete_XMIT_Time=0
  real Application_Message_RCV_Complete_Time=0
  real Application_Message_Destination_Time=0
  real Application_Message_User_Data
  gvec My_Vector_Data
}

</DEFINE_DATA_STRUCTURES>
```
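To make the layout of data\_structs.txt concrete, here is a small Python sketch that reads struct definitions of the shape shown above into a dictionary. It is not part of CSIM; the grammar (one `type name[=default]` field per line, braces around the body) is inferred from the example only, and `parse_data_structs` is a hypothetical helper.

```python
import re


def parse_data_structs(text: str) -> dict:
    """Return {struct_name: [(type, field_name, default_or_None), ...]}."""
    structs = {}
    # Each definition looks like: struct Name { field lines }
    for name, body in re.findall(r"struct\s+(\w+)\s*\{(.*?)\}", text, flags=re.S):
        fields = []
        for line in body.splitlines():
            # A field line is "<type> <name>" with an optional "=<default>".
            m = re.match(r"\s*(\w+)\s+(\w+)\s*(?:=\s*(\S+))?\s*$", line)
            if m:
                fields.append((m.group(1), m.group(2), m.group(3)))
        structs[name] = fields
    return structs


EXAMPLE = """
<DEFINE_DATA_STRUCTURES>
struct Basic_Statistic
{ real mean
  real variance
  real min
  real max
  int Nsamples=0
}
</DEFINE_DATA_STRUCTURES>
"""

parsed = parse_data_structs(EXAMPLE)
```

Fields without an explicit default (such as `mean`) come back with `None`, which mirrors the fact that the file leaves them unspecified.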



# Creating Legible Models

#### General "rules"

- Limit the number of boxes to about 15
- Orient the flow to run top to bottom rather than left to right
- Maximize use of the available canvas
- Jog wires to improve signal legibility



## **Example of Message Flows**

The compound data structure used here is:

<DEFINE\_DATA\_STRUCTURES>

struct CompuSys

- char MsgType=Heartbeat
- char StackACK
- char ACK=NoACK
- int NUMBER
- int MsgLENGTH
- int PRIORITY
- real CREATED
- real COMPLETED
- real MEAN
- real EARLIEST
- real LATEST
- real INTERMEDIATE

</DEFINE\_DATA\_STRUCTURES>





## Data Structures Approach

- The Data\_Type\_Container (Envelope) is the atomic component of data structures for the general blocks library
- Compound data structures are built from linked lists of Envelopes
- Organization of an Envelope:

```
struct Data_Type_Container
{
    int kind, n1, n2;                          /* Type and dimension(s). */
    void *data;
    char *variable_name, *type_name;
    struct Data_Type_Container *next, *child;
    int ref_count;
} *DATA_STRUCTURE_DEFINITIONS = 0;

typedef struct Data_Type_Container Envelope;
```
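The linked-list idea can be illustrated with a Python mirror of the Envelope. This is a sketch only: it assumes that `child` points at the first field of a compound structure and that `next` chains the remaining fields, which the slide does not spell out, and `make_compound` and `field_names` are hypothetical helpers rather than library functions.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Envelope:
    variable_name: str
    type_name: str
    data: Any = None
    next: Optional["Envelope"] = None    # assumed: next field at the same level
    child: Optional["Envelope"] = None   # assumed: first field of a compound structure


def make_compound(type_name, fields):
    """Build a parent Envelope whose child chain holds one Envelope per field."""
    parent = Envelope(variable_name="", type_name=type_name)
    prev = None
    for ftype, fname, value in fields:
        env = Envelope(fname, ftype, value)
        if prev is None:
            parent.child = env   # first field hangs off the parent
        else:
            prev.next = env      # later fields are linked to the previous one
        prev = env
    return parent


def field_names(compound):
    """Walk the child chain and collect the field names in order."""
    names, env = [], compound.child
    while env is not None:
        names.append(env.variable_name)
        env = env.next
    return names


pkt = make_compound("Timing_Packet",
                    [("real", "Time_Created", 0.0), ("int", "Length", 64)])
```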





## **Copying Messages**

- There are two methods for copying (splitting) messages (data structures)
  - Pass a pointer (very fast)
  - Make a deep copy of the data structure (can be slow)
- Different models use one or the other approach (e.g., Junction uses pointers, CopyDS makes a deep copy)
- Deep copying may be required if both copies of the DS will be modified
- Pointer copying may be used if the copy is only being used as a trigger (for example)
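The difference between the two copying methods is the usual one between sharing a reference and duplicating the data. The Python sketch below illustrates it with a plain dictionary standing in for a message; the field names are borrowed from the examples in this deck and nothing here is CSIM code.

```python
import copy

message = {"PRIORITY": 1, "CREATED": 0.0}

by_pointer = message               # pointer-style copy: both names refer to one object
by_value = copy.deepcopy(message)  # deep copy: an independent duplicate

by_pointer["PRIORITY"] = 5         # also visible through `message`
by_value["CREATED"] = 12.5         # leaves `message` untouched
```

This is why a deep copy is needed when both branches modify the data structure, while a pointer copy is enough when one branch only uses the message as a trigger.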

#### Resources, Servers and Probes





- The properties of a Resource (e.g., a CPU) are defined using a Set\_Resource device
- Many (e.g., hundreds of) Servers (e.g., Service\_wPriority\_Preemption) may be mapped to a single Resource
- An individual Server is often used to represent the execution of a particular piece of software
- The correlation between resources, servers and probes is set by the ResourceID attribute

- Up to four Probes (as shown) may be attached to a given Resource
- The Utilization probes output two files:
  - Batched and global utilization
- The other probes each output four files:
  - Batched and global average
  - Batched and global peak



## Examples

- "Histogram testcase"
  - Objective:
    - Need to run many Monte Carlo iterations of a simulation
    - Need to collect latency statistics (min, mean and max) for four point pairs (12 data points per iteration)
    - Need to identify the global min, mean and max for each
    - Need a histogram of the complete data set for one of the point pairs
    - Need to generate all required output fully automatically
  - Approach:
    - Use the Iterator to run iterations and collect min, mean and max
    - Use a separate "simulation" to (redundantly) collect min, mean and max
    - Use another separate "simulation" to assemble a global histogram
    - Tie together with several scripts
    - Demonstrate some "unusual" applications of a CSIM model



# Block Diagram of "hist\_test"





#### Using General Blocks as a Visual Programming Environment

This CSIM "model" reads four files (scatter plot data), calculates the min, mean and max values for each, and appends the results onto other files.
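For comparison, the same file-processing task can be written as an ordinary script. The Python sketch below handles one file; it assumes each scatter-plot line holds whitespace-separated x and y values (the actual file layout is not shown in this deck), and the file names and the `summarize` helper are hypothetical.

```python
import os
import tempfile


def summarize(scatter_path, summary_path):
    """Append 'min mean max' of the y column of scatter_path to summary_path."""
    ys = []
    with open(scatter_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2:          # skip blank or one-column lines
                ys.append(float(parts[1]))
    stats = (min(ys), sum(ys) / len(ys), max(ys))
    with open(summary_path, "a") as f:   # append, so each run adds one line
        f.write("%g %g %g\n" % stats)
    return stats


# Small self-contained demonstration using a temporary directory.
workdir = tempfile.mkdtemp()
scatter = os.path.join(workdir, "latency.dat")
summary = os.path.join(workdir, "latency_summary.dat")
with open(scatter, "w") as f:
    f.write("0.0 2.0\n1.0 4.0\n2.0 9.0\n")

stats = summarize(scatter, summary)
```

Calling `summarize` once per input file, after every Monte Carlo iteration, would accumulate one summary line per iteration in each output file.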



#### The Phases of Development



- Phase 1: Initial Model Development/Debug
  - Graphical display can be extremely valuable in facilitating verification and debug
- Phase 2: Data Generation and Analysis
  - Usually, data generation (e.g., Monte Carlo) is most effectively completed using automated, non-graphical methods
  - Analysis of the collected data usually utilizes graphical methods (plotting, graphing, etc.)
- Phase 3: Results presentations/marketing
  - Presentations to management/customers can benefit from attractive real-time graphical demos

*Careful organization of the model in the beginning will greatly benefit the eventual real-time graphical demos* 



## Starting A New Model

- To start a new General Blocks model:
  - Include a reference to the General Blocks model library
    - File / Import by Reference
    - \$CSIM\_MODEL\_LIBS/general\_blocks/Library.sim
  - Begin drawing block diagrams
- The main file to include is Library.sim
  - Lists and categorizes all models
  - Includes All.sim
- The All.sim file contains all the block models

### Tricks to Speed Development



- Stepwise, "build a little, test a little" process works best
- Recommended flow for a new model
  - Import the required libraries and define all known top level variables and macros
  - Identify a small, well understood chunk of functionality
    - Implement, simulate and debug, and verify that the simulation results are as expected
  - Add another chunk; repeat
- Always work with the smallest model possible; use stubs whenever appropriate
- Use pop ups, event probes, process timelines, etc. to help verify connectivity



## Running Simulations Faster (Summary)

- For the fastest simulation turnaround:
  - Run nongraphically
  - Compile with optimization (-O2)
  - Execute from the local /tmp directory
  - Direct stdout and stderr into a file
  - Run from the fastest machine available



#### Running Simulations Faster (Details)

- Graphical simulations will run slower than nongraphical
- A running graphical simulation will run faster
  - With animation turned off
  - With a (slightly) larger time display increment
  - With terminal output (stdout and stderr) directed to a file
- You can build a faster graphical simulation
  - By turning off debugging (removing -g from gcc cmd)
  - By turning on optimization (adding -O2 to gcc cmd)
  - By copying all files to the local /tmp directory and executing there



#### Running Simulations Faster (Details, cont)

- Efficient simulations are *always* faster than inefficient simulations
  - Don't simulate anything that doesn't need to be simulated
    - Build times are proportional to the number of devices (boxes)
    - Simulation time is proportional to the number of device events
  - Don't simulate a longer period than necessary
  - Don't simulate unnecessary details
  - Extraneous devices, inefficiently implemented models, etc. slow things down proportionately



# ugui

- ugui is an xgraph front end, developed to simplify the display of multi-file xgraph plots
- Up to 16 files (data sets) may be combined into a single plot
- Each data set may have individual:
  - Colors
  - Line types
  - Point shapes
  - X and/or y shifts
  - X and/or Y scale factors
- Text and/or legend files can be included

Screenshot: the ugui main window. A menu bar (File, Data, Type, Color, Shape, Lines, Tools, Help) sits above a table with one row per data set and the columns File Name, Type, XScale, XShift, YScale, YShift, Color, Shape and Line.



# **Required Files**



- Files required locally for initial build:
  - yourSimFile.sim, data\_structs.txt, and (for now) soc\_lib.c
    - The Library.sim and/or All.sim files are referenced from \$CSIM\_ROOT/model\_libs/general\_blocks
    - Controlled local library files (e.g., IMA\_Lib.sim) are referenced from their repositories
- Build-created files
  - sim.exe and top\_tab.dat are required
  - out.c and netinfo are not required
  - INTERMED\*.csim files indicate a problem
- Some models require input data files
  - Control\_Signal\_Generator, Arbitrary\_PulseTrain

# Required Files (cont)



- Ancillary tool-related files
  - xgraph
    - Output data files from CSIM (\*.dat)
    - Optional annotations/labeling data (title.doc)
  - tlpp (tlpp\_gui)
    - EventHist.dat, tlpp.com
  - ugui
    - Setups saved in a \*.raw file
    - Generates an xgraph\_plot.com file

# A More Complex Example

- Consider the drawn distributed system
- A Sensor and two other subsystems are attached to a Core Switch
- The Core Switch connects to four Edge Switches
- Each Edge Switch connects to a number of Nodes
- We are interested in the Processor Utilization, Latencies and other performance metrics
- What do the Multi Core and Limited Threads Models do for us in analyzing this system?





### **Example System**

#### Top Level Block Diagram





### Details of Node\_2



# Details of Node\_2\_A\_Block



#### Detail of A2-A1a-1







#### **Router Attribute Menus**





#### **Application Attribute Menus**



### New Models



A New Capability in General Blocks-based modeling

- Models can be dynamic and self-configuring
- Models can more accurately represent the actual behavior of networks



#### Batcher/Unbatcher Models

- Generic\_Batcher and Generic\_UnBatcher
  - Developed to extend and simplify the GenericVector set of models (GVCreate, GVLen, Access\_GVector, Setup\_GVElem)
- Typically used to represent a set of small messages combined into a single large message
- Generic\_Batcher combines messages
- Generic\_UnBatcher separates messages
- Enables accurate measurement of end to end latencies
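The batching idea can be sketched in a few lines of Python. This is an illustration only, not the Generic\_Batcher implementation: the dictionary layout, the `size` and `CREATED` keys and the helper names are assumptions made for the example.

```python
def batch(messages):
    """Combine small messages into one large message that keeps the originals inside."""
    return {"kind": "batch",
            "payload": list(messages),
            "size": sum(m["size"] for m in messages)}


def unbatch(big_message):
    """Recover the original small messages from a batched message."""
    return list(big_message["payload"])


def end_to_end_latencies(big_message, arrival_time):
    """Latency of each original message, measured from its own creation time."""
    return [arrival_time - m["CREATED"] for m in unbatch(big_message)]


small = [{"size": 10, "CREATED": 0.0}, {"size": 20, "CREATED": 1.5}]
big = batch(small)
latencies = end_to_end_latencies(big, arrival_time=4.0)
```

Because each small message keeps its own creation time through the batch, the latency measured after unbatching covers the full path rather than only the time since the batch was formed.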



### Admin Model

- The Admin is a scheduler, oriented to distributing periodic tasks among a group of processors in a networked environment:
- Operation:
  - A message, containing a task name, is sent to the Admin to request initiation of the task. The "tasks" are typically comparable to a sequence diagram.
  - The admin uses the specified algorithm (four are currently supported) to assign the task to a processor. It updates its status table.
  - The admin sends a message to the assigned processor to notify it to accept a task of the specified type and ID
  - The processor interprets the message and starts the task.
  - At the completion of a task, the processor sends a message to the Admin to report the task completion.
  - The Admin updates its status table.



### Admin Task Assignment

- A file (task\_table.dat) defines:
  - Task names, processor names, scheduling algorithms and maximum task loading for each processor
- An example task table:

| Task | Algorithm | CPU1 | CPU2 | CPU3 | CPU4 | CPU5 | CPU6 |
|------|--------|------|------|------|------|------|------|
| tsk1 | fill_u | 8    | 7    | 0    | 0    | 7    | 8    |
| tsk2 | fill_d | 5    | 0    | 0    | 0    | 0    | 5    |
| tsk3 | u_task | 3    | 0    | 4    | 4    | 0    | 3    |
| tsk4 | u_all  | 5    | 2    | 6    | 6    | 2    | 5    |
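The request, assign, complete cycle of the Admin can be sketched as follows. This Python example is hypothetical throughout: it reads each table cell as the maximum number of concurrent instances of that task on that CPU, and it uses a simple least-loaded policy as a placeholder, which is not claimed to be any of the four algorithms the Admin supports.

```python
CPUS = ["CPU1", "CPU2", "CPU3", "CPU4", "CPU5", "CPU6"]
ROWS = [
    ("tsk1", "fill_u", [8, 7, 0, 0, 7, 8]),
    ("tsk2", "fill_d", [5, 0, 0, 0, 0, 5]),
    ("tsk3", "u_task", [3, 0, 4, 4, 0, 3]),
    ("tsk4", "u_all",  [5, 2, 6, 6, 2, 5]),
]
# task -> (algorithm name from the table, {cpu: assumed max concurrent instances})
TASK_TABLE = {task: (alg, dict(zip(CPUS, limits))) for task, alg, limits in ROWS}


class Admin:
    def __init__(self, table):
        self.table = table
        self.status = {}   # status table: (task, cpu) -> running instance count

    def assign(self, task):
        """Pick a CPU for the task and update the status table; None if all are full."""
        _, limits = self.table[task]
        candidates = [cpu for cpu, limit in limits.items()
                      if self.status.get((task, cpu), 0) < limit]
        if not candidates:
            return None
        # Placeholder policy: least-loaded CPU, ties broken by name.
        cpu = min(candidates, key=lambda c: (self.status.get((task, c), 0), c))
        self.status[(task, cpu)] = self.status.get((task, cpu), 0) + 1
        return cpu

    def complete(self, task, cpu):
        """Record a task-completion report from a processor."""
        self.status[(task, cpu)] -= 1


admin = Admin(TASK_TABLE)
first = admin.assign("tsk2")
second = admin.assign("tsk2")
admin.complete("tsk2", first)
third = admin.assign("tsk2")
```

In this run tsk2 can only land on CPU1 or CPU6, because every other CPU has a limit of 0 in its row.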



### Sender/Receiver

- The Sender and Receiver models use named synchrons to "wirelessly" send data structures between points
- One to one, one to many, many to one and many to many configurations can be supported
- Typically used to distribute control signals, alarms and triggers
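A rough picture of the named-channel idea, written as a Python sketch. It is not how CSIM synchrons are implemented; the `Synchrons` class and its method names are invented for the illustration, and delivery here is immediate rather than simulated in time.

```python
from collections import defaultdict


class Synchrons:
    """Registry of named channels; every receiver on a name gets every message sent to it."""

    def __init__(self):
        self._receivers = defaultdict(list)

    def receive_on(self, name, inbox):
        """Register an inbox (a plain list) as a receiver on the named channel."""
        self._receivers[name].append(inbox)

    def send(self, name, data_structure):
        """Deliver the data structure to all receivers registered on the name."""
        for inbox in self._receivers[name]:
            inbox.append(data_structure)


bus = Synchrons()
alarm_a, alarm_b = [], []
bus.receive_on("ALARM", alarm_a)
bus.receive_on("ALARM", alarm_b)
bus.send("ALARM", {"PRIORITY": 1})   # first sender
bus.send("ALARM", {"PRIORITY": 2})   # second sender
```

Two senders and two receivers sharing one name give the many-to-many case; the other configurations follow by using fewer senders or receivers.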



#### **Router Model**

- Router has 16 bidirectional ports
  - Flexible specification of routing rules, e.g.
    - route\_24\_50\_20 = p5
    - route\_24\_50\_20\_7 = p1
    - route\_DEFAULT = p2
    - route\_3\_2\_1\_1\_0\_3\_7up = p1
    - route\_cabinet2\_card3\_cpu4 = p4
    - route\_24\_50\_F = p3
- Supports multicast publish, subscribe
  - multi\_w\_x\_y\_z = p1\_p2\_p3\_p4
- Supports dynamic subscribe/unsubscribe
  - subscribe\_24\_50\_20\_6 = p6\_p8
  - unsubscribe\_24\_50\_20\_1 = p13\_p14
- Can be used as data sorter/selector
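
One plausible reading of rules such as route\_24\_50\_20 and route\_24\_50\_20\_7 is a most-specific-match lookup with route\_DEFAULT as the fallback. The Python sketch below illustrates that reading only. The matching semantics, the `select_port` function and the treatment of addresses as a list of fields are assumptions, not the documented behavior of the Router block, and special suffixes such as `F` or `7up` are not handled.

```python
# Rules copied from the examples above; the lookup logic is an assumption.
RULES = {
    "route_24_50_20": "p5",
    "route_24_50_20_7": "p1",
    "route_DEFAULT": "p2",
}


def select_port(address_fields, rules):
    """Return the port of the longest rule that matches a prefix of the address."""
    fields = [str(field) for field in address_fields]
    # Try the full address first, then progressively shorter prefixes.
    for length in range(len(fields), 0, -1):
        key = "route_" + "_".join(fields[:length])
        if key in rules:
            return rules[key]
    return rules["route_DEFAULT"]
```

Under this assumption, address 24.50.20.7 goes to p1, any other address under 24.50.20 goes to p5, and everything else goes to p2.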

### **Router Enablers**



- Used to create routing info:
  - Append\_Route\_List
  - Append\_String
  - Num\_to\_String
  - String\_to\_Num
- Used to split a full duplex link:
  - PortConvert

# Multi-Core Processor Model

- Developed to more accurately represent the execution of modern multi-core processors
  - Previous approaches (scaling application execution times) are inadequate
- Initially represented as ideal
- Currently, 1-16 cores can be represented
- Recently upgraded to include load-dependent performance degradation
- Speedup, as a function of active cores, is specified as a table, e.g., Speedup\_2 = 1.5

# Multi Core Model



- There are several (relatively) independent capabilities lumped under the heading Multi Core Model
  - Multi Core Model (basic)
  - "Amdahl" Performance Degradation Capability
  - Limited Thread Capability
  - Multi-part Thread Capability
  - "CoreLocked" Capability

### Multi Core Model (basic)



- The basic Multi Core Model enables the representation of multiple tasks executing simultaneously on a processor
- Model behavior:
  - The number of cores can be specified separately for each processor (i.e., MaxNumTasks)
  - Any task can execute on any core
  - No performance degradation modeled

### Multi Core Model (Amdahl)



- The Amdahl model extends the basic model with generic performance degradation
  - Not actually hard-wired to Amdahl's model
- The speedup behavior is specified in a table-like set of attributes, e.g., A80 is:
  - Speedup\_2 = 1.67
  - Speedup\_3 = 2.14
  - Speedup\_4 = 2.5
- Model behavior:
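
The A80 values agree with Amdahl's law, S(n) = 1 / ((1 - p) + p / n), for a parallel fraction p = 0.8. The short Python sketch below generates such a table. The function names are hypothetical, and since the model is not hard-wired to Amdahl's law, any other set of Speedup values could be entered instead.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup on `cores` cores for a given parallel fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def speedup_attributes(parallel_fraction, max_cores):
    """Build a Speedup_n table (n = 2..max_cores), rounded to two decimals."""
    return {
        f"Speedup_{n}": round(amdahl_speedup(parallel_fraction, n), 2)
        for n in range(2, max_cores + 1)
    }


if __name__ == "__main__":
    for name, value in speedup_attributes(0.8, 4).items():
        print(f"{name} = {value}")
```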

# Limited Thread Model



- The Limited Thread model is an extension to the basic model
- The Multi Core model by itself does not specifically model software which is single threaded or otherwise limited
  - Handled indirectly through the specification of Speedup values
- If information regarding software limitations or system configuration is available, the Limited Thread Model can be used to specifically represent single threaded or otherwise thread-limited software
- The Multi Core Model and Limited Thread model are independent
- Operation of Limited Threads (LT)
  - Software which is limited is tagged with a set of attributes
  - When a message is received to initiate an execution, tags and status are checked; LTs may span multiple blocks
    - If this LT is not active, execution is started and the Active tag is set
    - If this LT is active, and this message does not represent the Active thread, the incoming message is queued; each identified LT has its own queue independent of all other queues
    - If this message represents the Active thread, it is started
    - When a thread exits, the Active tag is reset and the next thread is released
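
The LT rules above can be sketched as a small state machine. This is an illustrative Python model of one LT group, not the CSIM implementation. The `LimitedThread` class, its method names and the use of a thread ID to recognize the Active thread are assumptions, and the sketch allows only one active thread per group.

```python
from collections import deque


class LimitedThread:
    """Hypothetical model of one LT group with a single active thread."""

    def __init__(self):
        self.active = None    # ID of the thread currently holding the Active tag
        self.queue = deque()  # this LT's own queue of waiting thread IDs

    def on_message(self, thread_id):
        """Return True if execution starts now, False if the message is queued."""
        if self.active is None:
            self.active = thread_id  # LT not active: start and set the Active tag
            return True
        if thread_id == self.active:
            return True              # Active thread continuing in a later block
        self.queue.append(thread_id)
        return False

    def on_exit(self):
        """Reset the Active tag and release the next queued thread, if any."""
        self.active = self.queue.popleft() if self.queue else None
        return self.active
```

Each LT group would hold its own instance, which mirrors the statement that every identified LT has a queue independent of all others.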

# Core Locked Model



- The Core Locked model enables the representation of threads (tasks, applications) that execute only on specific core(s)
- Any thread can be specified to execute on any or all of the available cores
- Specification of a thread (example):
  - NumLimitThrds = 7
  - Thread\_Name\_1 = DEF
  - Max\_Threads\_1 = 1
  - Max\_Thrd\_Prio\_1 = 3
  - Thrd\_Map\_1 = "1 3 4"
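
Assuming that a thread map such as "1 3 4" lists the core numbers the thread may run on, core selection could be sketched as below. The function names and the "lowest-numbered free core" choice are hypothetical and only illustrate the restriction, not the actual CSIM selection rule.

```python
def parse_thread_map(thrd_map):
    """Turn a map string such as "1 3 4" into a set of allowed core numbers."""
    return {int(core) for core in thrd_map.split()}


def pick_core(thrd_map, busy_cores):
    """Return the lowest-numbered allowed core that is free, or None."""
    free = sorted(parse_thread_map(thrd_map) - set(busy_cores))
    return free[0] if free else None
```

For example, with cores 1 and 2 busy, a thread mapped to "1 3 4" would be placed on core 3, and it would wait if cores 1, 3 and 4 were all busy.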



#### Simple Example of Multi Core and Limited Thread Operation

- This simple model will illustrate the behavior.
- Four identical sequences of tasks are indicated by the columns of six colored boxes.
- Boxes of the same color (row) are in the same LT group
- Each column begins execution shortly (0.01 milliseconds) after the column to its left.





#### Single Core Execution

- In the case of a Single Core, the task execution is fully sequential, from Block1 through Block24.
- The last block completes execution at T=4.8 seconds



Processing Time-Line Plot - Single Core

#### Ideal Four Core Execution




- In the case of the Ideal Four Core, the task execution is fully parallel
- The last block completes execution at T=1.2 seconds



Processing Time-Line Plot - Four Core Ideal



#### Four Core, Amdahl = 0.8 Execution

- In the Four Core, Amdahl 0.8 case, the task execution is fully parallel
- The Amdahl slowdown is apparent
- The last block completes execution at T=1.92 seconds
  - Speedup\_4 = 2.5 for Amdahl 0.8, 4 cores
  - T = 4.8/2.5 seconds = 1.92 seconds



#### **One Limited Thread Execution**



- In the One Limited Thread case, the second row of tasks, Block5 through Block8 (orange boxes), represents single threaded software
- Amdahl slowdown is turned off
- The last block completes execution at 1.8 seconds
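
The 1.8 second result can be reproduced with a small timeline calculation. The Python sketch below is an illustration, not CSIM code. It assumes each block runs for 0.2 seconds (inferred from 24 blocks completing in 4.8 seconds on a single core), gives each column its own core, and ignores the 0.01 millisecond stagger by default. The function name and parameters are hypothetical.

```python
def simulate_one_limited_thread(columns=4, rows=6, block_time=0.2,
                                limited_row=1, stagger=0.0):
    """Return the finish time of each column when one row is single threaded."""
    lt_free_at = 0.0  # time at which the limited thread next becomes available
    finish_times = []
    # Columns are identical and start left to right, so they reach the
    # limited row in column order and can be processed in that order.
    for col in range(columns):
        t = col * stagger
        for row in range(rows):
            if row == limited_row:
                start = max(t, lt_free_at)  # wait while another column holds the LT
                t = start + block_time
                lt_free_at = t
            else:
                t += block_time             # unrestricted block on this column's core
        finish_times.append(t)
    return finish_times


if __name__ == "__main__":
    times = simulate_one_limited_thread()
    print([round(t, 2) for t in times])
```

The first column is never delayed and finishes at 1.2 seconds. Each later column waits one extra block time for the single threaded row, so the fourth column finishes at about 1.8 seconds.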





#### One Limited Thread, Amdahl 0.8 Execution

- In the One Limited Thread, Amdahl 0.8 case, the second row of tasks, Block5 through Block8 (orange boxes), represents single threaded software
- Amdahl slowdown is set for Amdahl 0.8
- Note the load-dependent dilation
- The last block completes execution at 2.4 seconds





#### Service Models Dependencies



# SPP Model Capabilities



#### Original

- Priority
- Preemption
- Utilization
- Utilization Per Priority
- Occupancy
- Response

#### Added

- Process Timeline
- Self Test
- Multi Core (ideal)
- "Amdahl" Degradation
- Limited Thread
- Limited Thread Continuation
- Core Locking
- Utilization Per Core
- Utilization By Thread
- Utilization By Core
- Thread Queue Occupancy

#### **Recent CSIM Evolution**





### General Blocks Data Flow







#### Generic Development Flow

- Construct model
- Build simulation
- Execute simulation
- Review output data

