Solaris: IRQ and CPU affinity

Today I faced an interesting problem. To cut a long story short, Solaris 10 (05/09 release) seemed to assign network card IRQs in the following manner on a 16-way X4450 server I am testing:

# echo '::interrupts -d' | mdb -k | egrep 'CPU|e1000g'
IRQ Vector IPL Bus Type CPU Share APIC/INT# Driver Name(s)
51 0x60 6 MSI 9 1 - e1000g#0
52 0x61 6 MSI 12 1 - e1000g#2
53 0x62 6 MSI 13 1 - e1000g#1
54 0x63 6 MSI 14 1 - e1000g#3

While this seems nice at first glance (one CPU core per NIC), it is not. psrinfo provides some further insight:

# psrinfo -pv
The physical processor has 4 virtual processors (0 4-6)
x86 (chipid 0x0 GenuineIntel family 6 model 15 step 11 clock 2933 MHz)
Intel(r) CPU @ 2.93GHz
The physical processor has 4 virtual processors (1 7-9)
x86 (chipid 0x2 GenuineIntel family 6 model 15 step 11 clock 2933 MHz)
Intel(r) CPU @ 2.93GHz
The physical processor has 4 virtual processors (2 10-12)
x86 (chipid 0x4 GenuineIntel family 6 model 15 step 11 clock 2933 MHz)
Intel(r) CPU @ 2.93GHz
The physical processor has 4 virtual processors (3 13-15)
x86 (chipid 0x6 GenuineIntel family 6 model 15 step 11 clock 2933 MHz)
Intel(r) CPU @ 2.93GHz

In short, the default e1000g IRQ assignment seems to have a “preference” towards the last physical processors. On my end I’d prefer a more balanced approach, something like getting each e1000g instance on a core of a different physical CPU.
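
Cross-referencing the CPU column of the mdb output with the psrinfo groupings makes the imbalance obvious:

e1000g#0  IRQ 51  CPU  9  -> 2nd physical processor (virtual CPUs 1,7-9)
e1000g#2  IRQ 52  CPU 12  -> 3rd physical processor (virtual CPUs 2,10-12)
e1000g#1  IRQ 53  CPU 13  -> 4th physical processor (virtual CPUs 3,13-15)
e1000g#3  IRQ 54  CPU 14  -> 4th physical processor (virtual CPUs 3,13-15)

The first physical processor (virtual CPUs 0,4-6) services no NIC at all, while the last one services two.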

A number of webpage views later: intrd was not available on my system, hacking the network driver was not a valid option (it was almost midnight, and besides, installing a driver of my own at one of the largest mobile operators in Europe? give me a break), and I found myself cursing the fact that I could not set the SMP affinity of each IRQ on my own. I asked my Sun expert for help and almost gave up.

Then somehow I decided not to give up and ran into the Solaris 10 10/09 What’s New notes and pcitool (which, curiously enough, doesn’t even have its manpage at docs.sun.com). Party time!

  • Copy the SUNWio-tools package from a Solaris 10/09 DVD and install it


# cd /net/jumpstart/Sol10_U8_x86/Solaris_10/Product
# find SUNWio-tools | cpio -p -dum -v /var/tmp
# cd /var/tmp
# yes | pkgadd -d . SUNWio-tools
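
To double-check that the package actually made it in, and to see where pcitool landed, something along these lines should do (output omitted here):

# pkginfo -l SUNWio-tools
# pkgchk -l SUNWio-tools | grep -i pathname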

  • Figure out the PCI nexus where your e1000g cards are present


# cat /etc/path_to_inst | grep e1000
"/pci@0,0/pci8086,3605@2/pci8086,3500@0/pci8086,3518@2/pci108e,4836@0" 0 "e1000g"
"/pci@0,0/pci8086,3605@2/pci8086,3500@0/pci8086,3518@2/pci108e,4836@0,1" 1 "e1000g"
"/pci@0,0/pci8086,2690@1c/pci108e,4836@0" 2 "e1000g"
"/pci@0,0/pci8086,2690@1c/pci108e,4836@0,1" 3 "e1000g"

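The bit pcitool cares about is the root nexus, i.e. the leading /pci@0,0 component of those paths. On this box all four ports sit under the same nexus, which a quick (and admittedly crude) one-liner confirms:

# awk -F'"' '/e1000g/ { split($2, p, "/"); print "/" p[2] }' /etc/path_to_inst | sort -u
/pci@0,0
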
  • Figure out the “ino” for each e1000g card


# pcitool /pci@0,0 -i
[snip]
ino 60 mapped to cpu 9
Device: /pci@0,0/pci8086,3605@2/pci8086,3500@0/pci8086,3518@2/pci108e,4836@0
Driver: e1000g, instance 0

ino 61 mapped to cpu c
Device: /pci@0,0/pci8086,2690@1c/pci108e,4836@0
Driver: e1000g, instance 2

ino 62 mapped to cpu d
Device: /pci@0,0/pci8086,3605@2/pci8086,3500@0/pci8086,3518@2/pci108e,4836@0,1
Driver: e1000g, instance 1

ino 63 mapped to cpu e
Device: /pci@0,0/pci8086,2690@1c/pci108e,4836@0,1
Driver: e1000g, instance 3
[/snip]

  • Bind each device’s interrupt to the desired CPU (in this example we go with CPUs 6, 9, 12 and 15; note that pcitool both reports and accepts CPU IDs in hexadecimal, hence cpu=c and cpu=f below)


# pcitool /pci@0,0 -i ino=60 -w cpu=6
# pcitool /pci@0,0 -i ino=61 -w cpu=9
# pcitool /pci@0,0 -i ino=62 -w cpu=c
# pcitool /pci@0,0 -i ino=63 -w cpu=f

  • Profit!


# echo '::interrupts -d' | mdb -k | egrep 'CPU|e1000g'
IRQ Vector IPL Bus Type CPU Share APIC/INT# Driver Name(s)
51 0x60 6 MSI 6 1 - e1000g#0
52 0x61 6 MSI 9 1 - e1000g#2
53 0x62 6 MSI 12 1 - e1000g#1
54 0x63 6 MSI 15 1 - e1000g#3
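
Checking the new CPU IDs against the psrinfo groupings above, every NIC now sits on a core of a different physical processor:

e1000g#0  CPU  6  -> 1st physical processor (virtual CPUs 0,4-6)
e1000g#2  CPU  9  -> 2nd physical processor (virtual CPUs 1,7-9)
e1000g#1  CPU 12  -> 3rd physical processor (virtual CPUs 2,10-12)
e1000g#3  CPU 15  -> 4th physical processor (virtual CPUs 3,13-15)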

Update: My Solaris go-to guy suggested the following eeprom(1M) settings instead:

acpi-user-options=2
e1000g_intpt_bind_cpus=0,0,2,2

Not as generic, and almost totally undocumented, but it should do the trick.
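
For the record, this is how I read that suggestion in terms of actual commands; a sketch only, I have not tested it myself. Both values should end up in /boot/solaris/bootenv.rc and need a reboot to take effect:

# eeprom acpi-user-options=2
# eeprom e1000g_intpt_bind_cpus=0,0,2,2
# eeprom | egrep 'acpi-user-options|e1000g_intpt_bind_cpus'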

3 Responses to “Solaris: IRQ and CPU affinity”

  1. Eric Enright Says:

    Cool article, thank you.

  2. Bjoern Henkel Says:

    This helped me a lot – thanks!

  3. Jason Holland Says:

    I found an article that states the reason they set up the IRQs to be handled by a single core was that it allowed them to utilize the shared L2 cache.
