0 0000
1 0001
2 0010
3 0011
4 0100
5 0101
6 0110
7 0111
8 1000
9 1001
a 1010
b 1011
c 1100
d 1101
e 1110
f 1111
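
The table above maps each hexadecimal digit to one 4-bit group, so a hex number can be expanded digit by digit. A minimal sketch of that expansion (the function name `hex_to_bits` is only illustrative, it is not part of the notes):

```python
def hex_to_bits(hex_string: str) -> str:
    """Expand each hex digit into its 4-bit group, as in the table above."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_string)


print(hex_to_bits("a"))     # 1010
print(hex_to_bits("fffe"))  # 1111 1111 1111 1110
```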



address of 1st element of the stack (pointed to by SP) = FFFEh
element size = 1 word (W) = 2 B
unbounded filling of the stack = stack overflow (overwrites memory locations of the program, but not of the OS)
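
The stack behaviour described above can be sketched with a small model. This is only an illustration under assumptions: the class `Stack16`, the `limit` parameter and the dictionary used as memory are hypothetical, and the explicit overflow check exists only to make the condition visible (the notes describe overflow as memory simply being overwritten).

```python
WORD = 2              # element size in bytes (W = 2 B)
STACK_TOP = 0x10000   # one past the highest address, so the 1st element lands at FFFEh


class Stack16:
    """Toy model of a stack that grows downward, one 16-bit word at a time."""

    def __init__(self, limit: int):
        self.sp = STACK_TOP
        self.limit = limit   # hypothetical lowest address reserved for the stack
        self.mem = {}        # address -> stored word

    def push(self, value: int) -> None:
        if self.sp - WORD < self.limit:
            raise OverflowError("stack overflow: would overwrite program memory")
        self.sp -= WORD                    # decrement SP first, then store
        self.mem[self.sp] = value & 0xFFFF

    def pop(self) -> int:
        value = self.mem.pop(self.sp)      # read the word at SP, then increment SP
        self.sp += WORD
        return value


stack = Stack16(limit=0xFFFA)
stack.push(0x1234)
print(hex(stack.sp))  # 0xfffe
```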

Heap area = dynamic memory = optional zone where, during runtime, the programmer explicitly allocates memory for variables whose size is known only during execution (e.g. the size of an input string). Its size is not predetermined, and memory can be allocated and deallocated several times during runtime. The heap is managed through instructions in the code area.

Stack area = memory zone handled automatically by compilers. Starting from the programmer's instructions, the compiler manages the stack transparently to the programmer, allocating and deallocating local variables and the parameters passed to procedures. Only in assembly programming is it possible to handle the stack area directly, with dedicated instructions.



MC includes the CPU (= MP) + cache level 2 (larger but slower than the cache level 1 included in the MP)
HW <--( BIOS = SW normally contained in ROM or other non-volatile memory) --> SW
Clock - clock frequency - is the number of switches between 0 and 1 per second of the circuits in the MP; an instruction normally needs several clock cycles. The clock signal comes from a quartz oscillator and can be controlled via the BIOS
CMOS = complementary metal-oxide semiconductor: microchip that keeps the hardware and configuration settings, powered by a backup battery
cache level 1 (I-cache = instruction cache + D-cache = data cache) is inside the MP
cache level 2, larger but slower than L1, is in the MC, external to the MP but near it
BIU = bus interface unit: entry point of information into the processor; duplicates the information and sends it to cache L1 (I-cache and D-cache) and to cache L2
Fetch/decode unit: fetches instructions from the I-cache and decodes them
BTB = branch target buffer: compares every fetched instruction with the records in the buffer to check whether it is a branch that has already been executed








AX BX CX DX: general-purpose registers, used to store data and for arithmetic
SP BP DI SI: pointer and index registers, used to access memory




to address memory spaces
memory = segments
address = Segment:Offset
the segment = 16 bit + 0000b appended at right (= multiplied by 16), then the offset is added
eg: [CS] = 123Ah, [IP] = 341Bh
SEG:OFF = 123A:341B -> 123A0h + 341Bh = 157BBh = physical address
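
The Segment:Offset calculation above can be checked with a few lines of code. This is a sketch only: the function name `physical_address` is illustrative, and the 20-bit mask models the 1-MByte address space of the 8086, where a sum beyond FFFFFh wraps around.

```python
def physical_address(segment: int, offset: int) -> int:
    """Shift the 16-bit segment left by 4 bits (x16) and add the 16-bit offset."""
    return ((segment << 4) + offset) & 0xFFFFF  # keep 20 bits


print(hex(physical_address(0x123A, 0x341B)))  # 0x157bb
```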


from the top to bottom:
effective address
segment address
physical address


hexadecimal addresses, from the top to bottom:
first segment
2nd segment
3rd segment

FLAGS are indicator bits (grouped into a status register called the PSW) normally read by conditional jump instructions
OF: overflow flag. Set to 1 when the result of a signed addition or subtraction overflows
SF: sign flag. Set to 1 when the result of an arithmetic-logic operation is a negative number (= MSB of the result)
ZF: zero flag. Set to 1 when the result of an arithmetic-logic operation is zero
CF: carry flag. Set to 1 when an arithmetic-logic operation produces a carry or borrow (indicates overflow for unsigned numbers)
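
To make the four flags concrete, here is a small sketch that computes them for a 16-bit addition. The function `add16_flags` is hypothetical and covers addition only; it is meant as an illustration of the definitions above, not as a model of the whole PSW.

```python
def add16_flags(a: int, b: int):
    """Add two 16-bit values and return (result, flags) for OF, SF, ZF, CF."""
    a &= 0xFFFF
    b &= 0xFFFF
    raw = a + b
    result = raw & 0xFFFF
    cf = int(raw > 0xFFFF)            # carry out of bit 15: unsigned overflow
    zf = int(result == 0)             # result is zero
    sf = (result >> 15) & 1           # MSB of the result
    # Signed overflow: both operands have the same sign and the result's sign differs.
    of = int(((a ^ result) & (b ^ result) & 0x8000) != 0)
    return result, {"OF": of, "SF": sf, "ZF": zf, "CF": cf}


print(add16_flags(0x7FFF, 0x0001))  # (32768, {'OF': 1, 'SF': 1, 'ZF': 0, 'CF': 0})
print(add16_flags(0xFFFF, 0x0001))  # (0, {'OF': 0, 'SF': 0, 'ZF': 1, 'CF': 1})
```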










PIC








INTEL

16-bit Processors and Segmentation (1978)
The IA-32 architecture family was preceded by 16-bit processors, the 8086 and 8088. The 8086 has 16-bit registers and a 16-bit external data bus, with 20-bit addressing giving a 1-MByte address space. The 8088 is similar to the 8086 except it has an 8-bit external data bus.
The 8086/8088 introduced segmentation to the IA-32 architecture. With segmentation, a 16-bit segment register contains a pointer to a memory segment of up to 64 KBytes. Using four segment registers at a time, 8086/8088 processors are able to address up to 256 KBytes without switching between segments. The 20-bit addresses that can be formed using a segment register and an additional 16-bit pointer provide a total address range of 1 MByte.



The Intel 286 Processor (1982)
The Intel 286 processor introduced protected mode operation into the IA-32 architecture. Protected mode uses the segment register content as selectors or pointers into descriptor tables. Descriptors provide 24-bit base addresses with a physical memory size of up to 16 MBytes, support for virtual memory management on a segment swapping basis, and a number of protection mechanisms. These mechanisms include:
• Segment limit checking
• Read-only and execute-only segment options
• Four privilege levels



The Intel386 Processor (1985)
The Intel386 processor was the first 32-bit processor in the IA-32 architecture family. It introduced 32-bit registers for use both to hold operands and for addressing. The lower half of each 32-bit Intel386 register retains the properties of the 16-bit registers of earlier generations, permitting backward compatibility. The processor also provides a virtual-8086 mode that allows for even greater efficiency when executing programs created for 8086/8088 processors.
In addition, the Intel386 processor has support for:
• A 32-bit address bus that supports up to 4-GBytes of physical memory
• A segmented-memory model and a flat memory model
• Paging, with a fixed 4-KByte page size providing a method for virtual memory management
• Support for parallel stages
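
With a fixed 4-KByte page size, a 32-bit address divides into a page number and a 12-bit offset inside the page. A minimal sketch of that split (the function name is illustrative, and the page tables the processor actually uses for translation are left out):

```python
PAGE_SIZE = 4 * 1024          # fixed 4-KByte pages
ADDRESS_SPACE = 1 << 32       # 32-bit addresses: 4 GBytes


def split_linear_address(address: int):
    """Return (page number, offset within the page) for a 32-bit address."""
    return address // PAGE_SIZE, address % PAGE_SIZE


page, offset = split_linear_address(0x12345678)
print(hex(page), hex(offset))        # 0x12345 0x678
print(ADDRESS_SPACE // PAGE_SIZE)    # 1048576
```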



The Intel486 Processor (1989)
The Intel486 processor added more parallel execution capability by expanding the Intel386 processor’s instruction decode and execution units into five pipelined stages. Each stage operates in parallel with the others on up to five instructions in different stages of execution.
In addition, the processor added:
• An 8-KByte on-chip first-level cache that increased the percent of instructions that could execute at the scalar rate of one per clock
• An integrated x87 FPU
• Power saving and system management capabilities



The Intel Pentium Processor (1993)
The introduction of the Intel Pentium processor added a second execution pipeline to achieve superscalar performance (two pipelines, known as u and v, together can execute two instructions per clock). The on-chip first-level cache doubled, with 8 KBytes devoted to code and another 8 KBytes devoted to data. The data cache uses the MESI protocol to support more efficient write-back cache in addition to the write-through cache previously used by the Intel486 processor. Branch prediction with an on-chip branch table was added to increase performance in looping constructs.
In addition, the processor added:
• Extensions to make the virtual-8086 mode more efficient and allow for 4-MByte as well as 4-KByte pages
• Internal data paths of 128 and 256 bits add speed to internal data transfers
• Burstable external data bus was increased to 64 bits
• An APIC to support systems with multiple processors
• A dual processor mode to support glueless two processor systems
A subsequent stepping of the Pentium family introduced Intel MMX technology (the Pentium Processor with MMX technology). Intel MMX technology uses the single-instruction, multiple-data (SIMD) execution model to perform parallel computations on packed integer data contained in 64-bit registers. See Section 2.2.7, “SIMD Instructions.”



2.1.6 The P6 Family of Processors (1995-1999)
The P6 family of processors was based on a superscalar microarchitecture that set new performance standards; see also Section 2.2.1, “P6 Family Microarchitecture.” One of the goals in the design of the P6 family microarchitecture was to exceed the performance of the Pentium processor significantly while using the same 0.6-micrometer, four-layer, metal BICMOS manufacturing process. Members of this family include the following:
• The Intel Pentium Pro processor is three-way superscalar. Using parallel processing techniques, the processor is able on average to decode, dispatch, and complete execution of (retire) three instructions per clock cycle. The Pentium Pro introduced the dynamic execution (micro-data flow analysis, out-of-order execution, superior branch prediction, and speculative execution) in a superscalar implementation. The processor was further enhanced by its caches. It has the same two on-chip 8-KByte 1st-Level caches as the Pentium processor and an additional 256-KByte Level 2 cache in the same package as the processor.
• The Intel Pentium II processor added Intel MMX technology to the P6 family processors along with new packaging and several hardware enhancements. The processor core is packaged in the single edge contact cartridge (SECC). The Level 1 data and instruction caches were enlarged to 16 KBytes each, and Level 2 cache sizes of 256 KBytes, 512 KBytes, and 1 MByte are supported. A half-frequency backside bus connects the Level 2 cache to the processor. Multiple low-power states such as AutoHALT, Stop-Grant, Sleep, and Deep Sleep are supported to conserve power when idling.
• The Pentium II Xeon processor combined the premium characteristics of previous generations of Intel processors. This includes: 4-way, 8-way (and up) scalability and a 2 MByte 2nd-Level cache running on a full-frequency backside bus.



INTEL 64 AND IA-32 ARCHITECTURES
• The Intel Celeron processor family focused on the value PC market segment. Its introduction offers an integrated 128 KBytes of Level 2 cache and a plastic pin grid array (P.P.G.A.) form factor to lower system design cost.
• The Intel Pentium III processor introduced the Streaming SIMD Extensions (SSE) to the IA-32 architecture. SSE extensions expand the SIMD execution model introduced with the Intel MMX technology by providing a new set of 128-bit registers and the ability to perform SIMD operations on packed single-precision floating-point values. See Section 2.2.7, “SIMD Instructions.”
• The Pentium III Xeon processor extended the performance levels of the IA-32 processors with the enhancement of a full-speed, on-die, and Advanced Transfer Cache.

The Intel Pentium 4 Processor Family (2000-2006)
The Intel Pentium 4 processor family is based on Intel NetBurst microarchitecture.
The Intel Pentium 4 processor introduced Streaming SIMD Extensions 2 (SSE2).
The Intel Pentium 4 processor 3.40 GHz, supporting Hyper-Threading Technology introduced Streaming SIMD Extensions 3 (SSE3); see Section 2.2.7, “SIMD Instructions.”
Intel 64 architecture was introduced in the Intel Pentium 4 Processor Extreme Edition supporting Hyper-Threading Technology and in the Intel Pentium 4 Processor 6xx and 5xx sequences. Intel Virtualization Technology (Intel® VT) was introduced in the Intel Pentium 4 processor 672 and 662.



2.1.8 The Intel® Xeon® Processor (2001-2007)
Intel Xeon processors (with exception for dual-core Intel Xeon processor LV, Intel Xeon processor 5100 series) are based on the Intel NetBurst microarchitecture; see Section 2.2.2, “Intel NetBurst® Microarchitecture.” As a family, this group of IA-32 processors (more recently Intel 64 processors) is designed for use in multi-processor server systems and high-performance workstations.
The Intel Xeon processor MP introduced support for Intel® Hyper-Threading Technology; see Section 2.2.8, “Intel® Hyper-Threading Technology.”
The 64-bit Intel Xeon processor 3.60 GHz (with an 800 MHz System Bus) was used to introduce Intel 64 architecture. The Dual-Core Intel Xeon processor includes dual core technology. The Intel Xeon processor 70xx series includes Intel Virtualization Technology.
The Intel Xeon processor 5100 series introduces power-efficient, high performance Intel Core microarchitecture. This processor is based on Intel 64 architecture; it includes Intel Virtualization Technology and dual-core technology. The Intel Xeon processor 3000 series are also based on Intel Core microarchitecture. The Intel Xeon processor 5300 series introduces four processor cores in a physical package, they are also based on Intel Core microarchitecture.



The Intel Pentium M Processor (2003-2006)
The Intel Pentium M processor family is a high performance, low power mobile processor family with microarchitectural enhancements over previous generations of IA-32 Intel mobile processors. This family is designed for extending battery life and seamless integration with platform innovations that enable new usage models (such as extended mobility, ultra thin form-factors, and integrated wireless networking). Its enhanced microarchitecture includes:
• Support for Intel Architecture with Dynamic Execution
• A high performance, low-power core manufactured using Intel’s advanced process technology with copper interconnect
• On-die, primary 32-KByte instruction cache and 32-KByte write-back data cache
• On-die, second-level cache (up to 2 MByte) with Advanced Transfer Cache Architecture
• Advanced Branch Prediction and Data Prefetch Logic
• Support for MMX technology, Streaming SIMD instructions, and the SSE2 instruction set
• A 400 or 533 MHz, Source-Synchronous Processor System Bus
• Advanced power management using Enhanced Intel SpeedStep technology
2.1.10 The Intel Pentium Processor Extreme Edition (2005)
The Intel Pentium processor Extreme Edition introduced dual-core technology. This technology provides advanced hardware multi-threading support. The processor is based on Intel NetBurst microarchitecture and supports SSE, SSE2, SSE3, Hyper-Threading Technology, and Intel 64 architecture.

The Intel Core Duo and Intel Core Solo Processors (2006-2007)
The Intel Core Duo processor offers power-efficient, dual-core performance with a low-power design that extends battery life. This family and the single-core Intel Core Solo processor offer microarchitectural enhancements over Pentium M processor family. Its enhanced microarchitecture includes:
• Intel® Smart Cache which allows for efficient data sharing between two processor cores
• Improved decoding and SIMD execution
• Intel® Dynamic Power Coordination and Enhanced Intel® Deeper Sleep to reduce power consumption
• Intel® Advanced Thermal Manager which features digital thermal sensor interfaces
• Support for power-optimized 667 MHz bus
The dual-core Intel Xeon processor LV is based on the same microarchitecture as Intel Core Duo processor, and supports IA-32 architecture.



The Intel Xeon Processor 5100, 5300 Series and Intel Core 2 Processor Family (2006)
The Intel Xeon processor 3000, 3200, 5100, 5300, and 7300 series, Intel Pentium Dual-Core, Intel Core 2 Extreme, Intel Core 2 Quad processors, and Intel Core 2 Duo processor family support Intel 64 architecture; they are based on the high-performance, power-efficient Intel® Core microarchitecture built on 65 nm process technology. The Intel Core microarchitecture includes the following innovative features:
• Intel Wide Dynamic Execution to increase performance and execution throughput
• Intel Intelligent Power Capability to reduce power consumption
• Intel Advanced Smart Cache which allows for efficient data sharing between two processor cores
• Intel Smart Memory Access to increase data bandwidth and hide latency of memory accesses
• Intel Advanced Digital Media Boost which improves application performance using multiple generations of Streaming SIMD extensions
The Intel Xeon processor 5300 series, Intel Core 2 Extreme processor QX6800 series, and Intel Core 2 Quad processors support Intel quad-core technology.



The Intel Xeon Processor 5200, 5400, 7400 Series and Intel® Core™2 Processor Family (2007)
The Intel Xeon processor 5200, 5400, and 7400 series, Intel Core 2 Quad processor Q9000 Series, Intel Core 2 Duo processor E8000 series support Intel 64 architecture; they are based on the Enhanced Intel® Core microarchitecture using 45 nm process technology. The Enhanced Intel Core microarchitecture provides the following improved features:
• A radix-16 divider, faster OS primitives further increases the performance of Intel® Wide Dynamic Execution.
• Improves Intel® Advanced Smart Cache with Up to 50% larger level-two cache and up to 50% increase in way-set associativity.
• A 128-bit shuffler engine significantly improves the performance of Intel® Advanced Digital Media Boost and SSE4.
Intel Xeon processor 5400 series and Intel Core 2 Quad processor Q9000 Series support Intel quad-core technology. Intel Xeon processor 7400 series offers up to six processor cores and an L3 cache up to 16 MBytes.



2.1.14 The Intel® Atom™ Processor Family (2008)
The first generation of Intel® Atom™ processors are built on 45 nm process technology. They are based on a new microarchitecture, Intel® Atom™ microarchitecture, which is optimized for ultra low power devices. The Intel® Atom™ microarchitecture features two in-order execution pipelines that minimize power consumption, increase battery life, and enable ultra-small form factors. The initial Intel Atom Processor family and subsequent generations including Intel Atom processor D2000, N2000, E2000, Z2000, C1000 series provide the following features:
• Enhanced Intel® SpeedStep® Technology
• Intel® Hyper-Threading Technology
• Deep Power Down Technology with Dynamic Cache Sizing
• Support for instruction set extensions up to and including Supplemental Streaming SIMD Extensions 3 (SSSE3).
• Support for Intel® Virtualization Technology
• Support for Intel® 64 Architecture (excluding Intel Atom processor Z5xx Series)



2.1.15 The Intel® Atom™ Processor Family Based on Silvermont Microarchitecture (2013)
Intel Atom Processor C2xxx, E3xxx, S1xxx series are based on the Silvermont microarchitecture. Processors based on the Silvermont microarchitecture supports instruction set extensions up to and including SSE4.2, AESNI, and PCLMULQDQ.



2.1.16 The Intel® Core™i7 Processor Family (2008)
The Intel Core i7 processor 900 series support Intel 64 architecture; they are based on Intel® microarchitecture code name Nehalem using 45 nm process technology. The Intel Core i7 processor and Intel Xeon processor 5500 series include the following innovative features:
• Intel® Turbo Boost Technology converts thermal headroom into higher performance.
• Intel® Hyper-Threading Technology in conjunction with Quadcore to provide four cores and eight threads.
• Dedicated power control unit to reduce active and idle power consumption.
• Integrated memory controller on the processor supporting three channels of DDR3 memory.
• 8 MB inclusive Intel® Smart Cache.
• Intel® QuickPath interconnect (QPI) providing point-to-point link to chipset.
• Support for SSE4.2 and SSE4.1 instruction sets.
• Second generation Intel Virtualization Technology.



2.1.17 The Intel Xeon Processor 7500 Series (2010)
The Intel Xeon processor 7500 and 6500 series are based on Intel microarchitecture code name Nehalem using 45 nm process technology. They support the same features described in Section 2.1.16, plus the following innovative features:
• Up to eight cores per physical processor package.
• Up to 24 MB inclusive Intel® Smart Cache.
• Provides Intel® Scalable Memory Interconnect (Intel® SMI) channels with Intel® 7500 Scalable Memory Buffer to connect to system memory.
• Advanced RAS supporting software recoverable machine check architecture.



2.1.18 2010 Intel® Core™ Processor Family (2010)
2010 Intel Core processor family spans Intel Core i7, i5 and i3 processors. They are based on Intel® microarchitecture code name Westmere using 32 nm process technology. The innovative features can include:
• Deliver smart performance using Intel Hyper-Threading Technology plus Intel Turbo Boost Technology.
• Enhanced Intel Smart Cache and integrated memory controller.
• Intelligent power gating.
• Repartitioned platform with on-die integration of 45 nm integrated graphics.
• Range of instruction set support up to AESNI, PCLMULQDQ, SSE4.2 and SSE4.1.



2.1.19 The Intel® Xeon® Processor 5600 Series (2010)
The Intel Xeon processor 5600 series are based on Intel microarchitecture code name Westmere using 32 nm process technology. They support the same features described in Section 2.1.16, plus the following innovative features:
• Up to six cores per physical processor package.
• Up to 12 MB enhanced Intel® Smart Cache.
• Support for AESNI, PCLMULQDQ, SSE4.2 and SSE4.1 instruction sets.
• Flexible Intel Virtualization Technologies across processor and I/O.



2.1.20 The Second Generation Intel® Core™ Processor Family (2011)
The Second Generation Intel Core processor family spans Intel Core i7, i5 and i3 processors based on the Sandy Bridge microarchitecture. They are built from 32 nm process technology and have innovative features including:
• Intel Turbo Boost Technology for Intel Core i5 and i7 processors
• Intel Hyper-Threading Technology.
• Enhanced Intel Smart Cache and integrated memory controller.
• Processor graphics and built-in visual features like Intel® Quick Sync Video, Intel® Insider™ etc.
• Range of instruction set support up to AVX, AESNI, PCLMULQDQ, SSE4.2 and SSE4.1.
Intel Xeon processor E3-1200 product family is also based on the Sandy Bridge microarchitecture. Intel Xeon processor E5-2400/1400 product families are based on the Sandy Bridge-EP microarchitecture. Intel Xeon processor E5-4600/2600/1600 product families are based on the Sandy Bridge-EP microarchitecture and provide support for multiple sockets.



The Third Generation Intel Core Processor Family (2012)
The Third Generation Intel Core processor family spans Intel Core i7, i5 and i3 processors based on the Ivy Bridge microarchitecture. The Intel Xeon processor E7-8800/4800/2800 v2 product families and Intel Xeon processor E3-1200 v2 product family are also based on the Ivy Bridge microarchitecture. The Intel Xeon processor E5-2400/1400 v2 product families are based on the Ivy Bridge-EP microarchitecture.
The Intel Xeon processor E5-4600/2600/1600 v2 product families are based on the Ivy Bridge-EP microarchitecture and provide support for multiple sockets.

The Fourth Generation Intel Core Processor Family (2013)
The Fourth Generation Intel Core processor family spans Intel Core i7, i5 and i3 processors based on the Haswell microarchitecture. Intel Xeon processor E3-1200 v3 product family is also based on the Haswell microarchitecture.

MORE ON SPECIFIC ADVANCES
The following sections provide more information on major innovations.

P6 Family Microarchitecture
The Pentium Pro processor introduced a new microarchitecture commonly referred to as P6 processor microarchitecture.
The P6 processor microarchitecture was later enhanced with an on-die, Level 2 cache, called Advanced Transfer Cache.
The microarchitecture is a three-way superscalar, pipelined architecture. Three-way superscalar means that by using parallel processing techniques, the processor is able on average to decode, dispatch, and complete execution of (retire) three instructions per clock cycle. To handle this level of instruction throughput, the P6 processor family uses a decoupled, 12-stage superpipeline that supports out-of-order instruction execution.
The next figure shows a conceptual view of the P6 processor microarchitecture pipeline with the Advanced Transfer Cache enhancement.
To ensure a steady supply of instructions and data for the instruction execution pipeline, the P6 processor microarchitecture incorporates two cache levels. The Level 1 cache provides an 8-KByte instruction cache and an 8-KByte data cache, both closely coupled to the pipeline. The Level 2 cache provides 256-KByte, 512-KByte, or 1-MByte static RAM that is coupled to the core processor through a full clock-speed 64-bit cache bus.
The centerpiece of the P6 processor microarchitecture is an out-of-order execution mechanism called dynamic execution. Dynamic execution incorporates three data-processing concepts:
• Deep branch prediction allows the processor to decode instructions beyond branches to keep the instruction pipeline full. The P6 processor family implements highly optimized branch prediction algorithms to predict the direction of the instruction.
• Dynamic data flow analysis requires real-time analysis of the flow of data through the processor to determine dependencies and to detect opportunities for out-of-order instruction execution. The out-of-order execution core can monitor many instructions and execute these instructions in the order that best optimizes the use of the processor’s multiple execution units, while maintaining the data integrity.
• Speculative execution refers to the processor’s ability to execute instructions that lie beyond a conditional branch that has not yet been resolved, and ultimately to commit the results in the order of the original instruction stream. To make speculative execution possible, the P6 processor microarchitecture decouples the dispatch and execution of instructions from the commitment of results. The processor’s out-of-order execution core uses data-flow analysis to execute all available instructions in the instruction pool and store the results in temporary registers. The retirement unit then linearly searches the instruction pool for completed instructions that no longer have data dependencies with other instructions or unresolved branch predictions. When completed instructions are found, the retirement unit commits the results of these instructions to memory and/or the IA-32 registers (the processor’s eight general-purpose registers and eight x87 FPU data registers) in the order they were originally issued and retires the instructions from the instruction pool.
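
The separation between out-of-order completion and in-order commitment described above can be illustrated with a toy model. Everything here is an assumption made for illustration: the instruction pool is a simple queue in program order, each entry has a hypothetical `done` flag, and `retire` commits only from the head of the queue. The real retirement unit is far more elaborate.

```python
from collections import deque


def retire(pool: deque) -> list:
    """Retire completed instructions from the head of the pool, in program order."""
    retired = []
    while pool and pool[0]["done"]:
        retired.append(pool.popleft()["name"])
    return retired


pool = deque([
    {"name": "i1", "done": True},
    {"name": "i2", "done": False},  # still executing
    {"name": "i3", "done": True},   # finished out of order, must wait for i2
])
print(retire(pool))        # ['i1']
pool[0]["done"] = True     # i2 completes
print(retire(pool))        # ['i2', 'i3']
```

In this sketch i3 finishes before i2 but is not committed until i2 has been retired, which is the in-order commitment the text describes.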






