I have deployed Java code on two different servers.The code is doing File Writing operations.
On the local server ,parameters are :
uname -a
SunOS snmi5001 5.10 Generic_120011-14 sun4u sparc SUNW,SPARC-Enterprise
ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 389296
coredump(blocks) unlimited
nofiles(descriptors) 20000
vmemory(kbytes) unlimited
Java Version:
java version "1.5.0_12"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_12-b04)
Java HotSpot(TM) Server VM (build 1.5.0_12-b04, mixed mode)
On a Different(lets say MIT) server :
uname -a
SunOS au11qapcwbtels2 5.10 Generic_147440-05 sun4u sparc SUNW,Sun-Fire-15000
ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) unlimited
nofiles(descriptors) 256
vmemory(kbytes) unlimited
java -version
java version "1.5.0_32"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_32-b05)
Java HotSpot(TM) Server VM (build 1.5.0_32-b05, mixed mode)
The problem is that the code is running signficatly slower on the MIT server.
Because of the difference in nofiles and stack for the two OS’s ,i thought if i change the ulimit -s and ulimit -n it would make a difference.
I cannot change the parameters on MIT server without confirming the problem,so the decreased the ulimit parameters for the local server and retested.But code finished execution is same time.
I have no idea what difference between the OS parameters which could be causing this.
Any help is appreciated.I will post more paramters if anyone tells me what to look for.
EDIT:
For MIT Server
No of CPU: psrinfo -p
24
psrinfo -pv
The physical processor has 2 virtual processors (0 4)
UltraSPARC-IV+ (portid 0 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (1 5)
UltraSPARC-IV+ (portid 1 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (2 6)
UltraSPARC-IV+ (portid 2 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (3 7)
UltraSPARC-IV+ (portid 3 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (32 36)
UltraSPARC-IV+ (portid 32 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (33 37)
UltraSPARC-IV+ (portid 33 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (34 38)
UltraSPARC-IV+ (portid 34 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (35 39)
UltraSPARC-IV+ (portid 35 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (64 68)
UltraSPARC-IV+ (portid 64 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (65 69)
UltraSPARC-IV+ (portid 65 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (66 70)
UltraSPARC-IV+ (portid 66 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (67 71)
UltraSPARC-IV+ (portid 67 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (96 100)
UltraSPARC-IV+ (portid 96 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (97 101)
UltraSPARC-IV+ (portid 97 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (98 102)
UltraSPARC-IV+ (portid 98 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (99 103)
UltraSPARC-IV+ (portid 99 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (128 132)
UltraSPARC-IV+ (portid 128 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (129 133)
UltraSPARC-IV+ (portid 129 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (130 134)
UltraSPARC-IV+ (portid 130 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (131 135)
UltraSPARC-IV+ (portid 131 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (224 228)
UltraSPARC-IV+ (portid 224 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (225 229)
UltraSPARC-IV+ (portid 225 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (226 230)
UltraSPARC-IV+ (portid 226 impl 0x19 ver 0x24 clock 1800 MHz)
The physical processor has 2 virtual processors (227 231)
UltraSPARC-IV+ (portid 227 impl 0x19 ver 0x24 clock 1800 MHz)
kstat cpu_info :
module: cpu_info instance: 231
name: cpu_info231 class: misc
brand UltraSPARC-IV+
chip_id 227
clock_MHz 1800
core_id 231
cpu_fru hc:///component=SB7
cpu_type sparcv9
crtime 587.102844985
current_clock_Hz 1799843256
device_ID 9223937394446500460
fpu_type sparcv9
implementation UltraSPARC-IV+ (portid 227 impl 0x19 ver 0x24 clock 1800 MHz)
pg_id 48
snaptime 19846866.5310415
state on-line
state_begin 1334854522
For the Local server i could only get the kstat info :
module: cpu_info instance: 0
name: cpu_info0 class: misc
brand SPARC64-VI
chip_id 1024
clock_MHz 2150
core_id 0
cpu_fru hc:///component=/MBU_A/CPUM0
cpu_type sparcv9
crtime 288.5675516
device_ID 250691889836161
fpu_type sparcv9
implementation SPARC64-VI (portid 1024 impl 0x6 ver 0x93 clock 2150 MHz)
snaptime 207506.8330168
state on-line
state_begin 1354493257
module: cpu_info instance: 1
name: cpu_info1 class: misc
brand SPARC64-VI
chip_id 1024
clock_MHz 2150
core_id 0
cpu_fru hc:///component=/MBU_A/CPUM0
cpu_type sparcv9
crtime 323.4572206
device_ID 250691889836161
fpu_type sparcv9
implementation SPARC64-VI (portid 1024 impl 0x6 ver 0x93 clock 2150 MHz)
snaptime 207506.8336113
state on-line
state_begin 1354493292
Similarly total 59 instances .
Also the memory for local server : vmstat
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s4 s1 in sy cs us sy id
0 0 0 143845984 93159232 431 895 1249 30 29 0 2 6 0 -0 1 3284 72450 6140 11 3 86
The memory for the MIT server : vmstat
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr m0 m1 m2 m3 in sy cs us sy id
0 0 0 180243376 184123896 81 786 248 15 15 0 0 3 14 -0 4 1854 7563 2072 1 1 98
df -h for MIT server:
Filesystem Size Used Available Capacity Mounted on
/dev/md/dsk/d0 7.9G 6.7G 1.1G 86% /
/devices 0K 0K 0K 0% /devices
ctfs 0K 0K 0K 0% /system/contract
proc 0K 0K 0K 0% /proc
mnttab 0K 0K 0K 0% /etc/mnttab
swap 171G 1.7M 171G 1% /etc/svc/volatile
objfs 0K 0K 0K 0% /system/object
sharefs 0K 0K 0K 0% /etc/dfs/sharetab
/platform/sun4u-us3/lib/libc_psr/libc_psr_hwcap2.so.1
7.9G 6.7G 1.1G 86% /platform/sun4u-us3/lib/libc_psr.so.1
/platform/sun4u-us3/lib/sparcv9/libc_psr/libc_psr_hwcap2.so.1
7.9G 6.7G 1.1G 86% /platform/sun4u-us3/lib/sparcv9/libc_psr.so.1
/dev/md/dsk/d3 7.9G 6.6G 1.2G 85% /var
swap 6.0G 56K 6.0G 1% /tmp
swap 171G 40K 171G 1% /var/run
swap 171G 0K 171G 0% /dev/vx/dmp
swap 171G 0K 171G 0% /dev/vx/rdmp
/dev/md/dsk/d5 2.0G 393M 1.5G 21% /home
/dev/vx/dsk/appdg/oravl
2.0G 17M 2.0G 1% /ora
/dev/md/dsk/d60 1.9G 364M 1.5G 19% /apps/stats
/dev/md/dsk/d4 16G 2.1G 14G 14% /var/crash
/dev/md/dsk/d61 1005M 330M 594M 36% /opt/controlm6
/dev/vx/dsk/appdg/oraproductvl
10G 2.3G 7.6G 24% /ora/product
/dev/md/dsk/d63 963M 1.0M 904M 1% /var/opt/app
/dev/vx/dsk/dmldg/appsdmlsvtvl
1.0T 130G 887G 13% /apps/dml/svt
/dev/vx/dsk/appdg/homeappusersvl
20G 19G 645M 97% /home/app/users
/dev/vx/dsk/dmldg/appsdmlmit2vl
20G 66M 20G 1% /apps/dml/mit2
/dev/vx/dsk/dmldg/datadmlmit2vl
1.9T 1.1T 773G 61% /data/dml/mit2
/dev/md/dsk/d62 9.8G 30M 9.7G 1% /usr/openv/netbackup/logs
df -h for local server :
Filesystem Size Used Available Capacity Mounted on
/dev/dsk/c0t0d0s0 20G 7.7G 12G 40% /
/devices 0K 0K 0K 0% /devices
ctfs 0K 0K 0K 0% /system/contract
proc 0K 0K 0K 0% /proc
mnttab 0K 0K 0K 0% /etc/mnttab
swap 140G 1.6M 140G 1% /etc/svc/volatile
objfs 0K 0K 0K 0% /system/object
fd 0K 0K 0K 0% /dev/fd
/dev/dsk/c0t0d0s5 9.8G 9.3G 483M 96% /var
swap 140G 504K 140G 1% /tmp
swap 140G 80K 140G 1% /var/run
swap 140G 0K 140G 0% /dev/vx/dmp
swap 140G 0K 140G 0% /dev/vx/rdmp
/dev/dsk/c0t0d0s6 9.8G 9.4G 403M 96% /opt
/dev/vx/dsk/eva8k/tlkhome
2.0G 66M 1.8G 4% /tlkhome
/dev/vx/dsk/eva8k/tlkuser4
48G 26G 20G 57% /tlkuser4
/dev/vx/dsk/eva8k/ST82
1.1G 17M 999M 2% /ST_A_82
/dev/vx/dsk/eva8k/tlkuser11
37G 37G 176M 100% /tlkuser11
/dev/vx/dsk/eva8k/oravl97
20G 12G 7.3G 63% /oravl97
/dev/vx/dsk/eva8k/tlkuser5
32G 23G 8.3G 74% /tlkuser5
/dev/vx/dsk/eva8k/mbtlkproj1
2.0G 18M 1.9G 1% /mbtlkproj1
/dev/vx/dsk/eva8k/Oravol98
38G 25G 12G 68% /oravl98
/dev/vx/dsk/eva8k_new/tlkuser15
57G 57G 0K 100% /tlkuser15
/dev/vx/dsk/eva8k/Oravol1
39G 16G 22G 42% /oravl01
/dev/vx/dsk/eva8k/Oravol99
30G 8.3G 20G 30% /oravl99
/dev/vx/dsk/eva8k/tlkuser9
18G 13G 4.8G 73% /tlkuser9
/dev/vx/dsk/eva8k/oravl08
32G 25G 6.3G 81% /oravl08
/dev/vx/dsk/eva8k/oravl07
46G 45G 1.2G 98% /oravl07
/dev/vx/dsk/eva8k/Oravol3
103G 90G 13G 88% /oravl03
/dev/vx/dsk/eva8k_new/tlkuser12
79G 79G 0K 100% /tlkuser12
/dev/vx/dsk/eva8k/Oravol4
88G 83G 4.3G 96% /oravl04
/dev/vx/dsk/eva8k/oravl999
10G 401M 9.0G 5% /oravl999
/dev/vx/dsk/eva8k_new/tlkuser14
54G 39G 15G 73% /tlkuser14
/dev/vx/dsk/eva8k/Oravol2
85G 69G 14G 84% /oravl02
/dev/vx/dsk/eva8k/sdkhome
1.0G 17M 944M 2% /sdkhome
/dev/vx/dsk/eva8k/tlkuser7
44G 36G 7.8G 83% /tlkuser7
/dev/vx/dsk/eva8k/tlkproj1
1.0G 17M 944M 2% /tlkproj1
/dev/vx/dsk/eva8k/tlkuser3
35G 29G 5.9G 84% /tlkuser3
/dev/vx/dsk/eva8k/tlkuser10
29G 29G 2.7M 100% /tlkuser10
/dev/vx/dsk/eva8k/oravl05
30G 29G 1.2G 97% /oravl05
/dev/vx/dsk/eva8k/oravl06
36G 34G 1.6G 96% /oravl06
/dev/vx/dsk/eva8k/tlkuser6
29G 27G 2.1G 93% /tlkuser6
/dev/vx/dsk/eva8k/tlkuser2
36G 30G 5.8G 84% /tlkuser2
/dev/vx/dsk/eva8k/tlkuser1
66G 49G 16G 75% /tlkuser1
/dev/vx/dsk/eva8k_new/tlkuser13
84G 77G 7.0G 92% /tlkuser13
/dev/vx/dsk/eva8k_new/tlkuser16
44G 37G 6.4G 86% /tlkuser16
/dev/vx/dsk/eva8k/db2
1.0G 593M 404M 60% /opt/db2V8.1
/dev/vx/dsk/eva8k/WebSphere6029
3.0G 2.2G 776M 75% /opt/WebSphere6029
/dev/vx/dsk/eva8k/websphere6
2.0G 88M 1.8G 5% /opt/websphere6
/dev/vx/dsk/eva8k/wli
4.0G 1.4G 2.5G 36% /opt/wli10gR3MP1
/dev/vx/dsk/eva8k/user
2.0G 19M 1.9G 1% /user/telstra/history
dvcinasdm3:/oracle_cdrom/data
576G 576G 206M 100% /oracle_cdrom
dvcinasdm2:/system_kits
822G 818G 4.2G 100% /system_kits
dvcinasdm2:/db_share 295G 283G 13G 96% /db_share
dvcinas2dm2:/system_data/data
315G 283G 32G 90% /system_data
dvcinas2dm2:/ossinfra/data
49G 18G 32G 36% /ossinfra
For local server the command : /usr/sbin/prtpicl -v | egrep "devfs-path|driver-name|subsystem-id" | nawk '/:subsystem-id/ { print $0; getline; print $0; getline; print $0; }' | nawk -F: '{ print $2 }' gives :
subsystem-id 0x13a1
devfs-path /pci@0,600000/pci@0/pci@8/pci@0/scsi@1
driver-name mpt
subsystem-id 0x1648
devfs-path /pci@0,600000/pci@0/pci@8/pci@0/network@2
driver-name bge
subsystem-id 0x1648
devfs-path /pci@0,600000/pci@0/pci@8/pci@0/network@2,1
driver-name bge
subsystem-id 0xfc11
devfs-path /pci@0,600000/pci@0/pci@8/pci@0,1/SUNW,emlxs@1
driver-name emlxs
subsystem-id 0x125e
devfs-path /pci@3,700000/network
driver-name e1000g
subsystem-id 0x125e
devfs-path /pci@3,700000/network
driver-name e1000g
subsystem-id 0x13a1
devfs-path /pci@10,600000/pci@0/pci@8/pci@0/scsi@1
driver-name mpt
subsystem-id 0x1648
devfs-path /pci@10,600000/pci@0/pci@8/pci@0/network
driver-name bge
subsystem-id 0x1648
devfs-path /pci@10,600000/pci@0/pci@8/pci@0/network
driver-name bge
subsystem-id 0xfc11
devfs-path /pci@10,600000/pci@0/pci@8/pci@0,1/SUNW,emlxs@1
driver-name emlxs
For MIT server it gives :
subsystem-id 0xfc00
devfs-path /pci@3d,600000/SUNW,emlxs@1
driver-name emlxs
subsystem-id 0xfc00
devfs-path /pci@3d,600000/SUNW,emlxs@1,1
driver-name emlxs
subsystem-id 0xfc00
devfs-path /pci@5d,600000/SUNW,emlxs@1
driver-name emlxs
subsystem-id 0xfc00
devfs-path /pci@5d,600000/SUNW,emlxs@1,1
driver-name emlxs
on the start of i/o consuming code,iostat -d c3t50001FE1502613A9d7 5 shows :
1161 37 134 0 0 0 0 0 0 329 24 2
3 2 3 0 0 0 0 0 0 554 71 10
195 26 6 0 0 0 0 0 0 853 108 19
37 6 4 0 0 0 0 0 0 1134 143 10
140 8 7 0 0 0 0 0 0 3689 86 7
173 24 85 0 0 0 0 0 0 9914 74 9
0 0 0 0 0 0 0 0 0 12323 114 2
13 9 41 0 0 0 0 0 0 10609 117 2
0 0 0 0 0 0 0 0 0 10746 72 2
sd0 sd1 sd4 ssd134
kps tps serv kps tps serv kps tps serv kps tps serv
1 0 3 0 0 0 0 0 0 11376 137 2
2 0 10 0 0 0 0 0 0 11980 157 3
231 39 14 0 0 0 0 0 0 10584 140 3
785 175 5 0 0 0 0 0 0 13503 170 2
9 4 32 0 0 0 0 0 0 11597 168 2
7 1 6 0 0 0 0 0 0 11555 106 2
On the MIT server iostat shows :
0.0 460.4 0.0 4029.2 0.4 0.6 0.9 1.2 2 11 c6t5006048452A79BD6d206
0.0 885.2 0.0 8349.3 0.5 0.8 0.6 0.9 3 24 c4t5006048452A79BD9d206
0.0 660.0 0.0 5618.8 0.5 0.7 0.7 1.0 2 18 c6t5006048452A79BD6d206
0.0 779.1 0.0 7408.6 0.3 0.7 0.4 0.8 2 21 c4t5006048452A79BD9d206
0.0 569.8 0.0 4893.9 0.3 0.5 0.5 1.0 2 15 c6t5006048452A79BD6d206
0.0 521.5 0.0 5433.6 0.2 0.5 0.3 0.9 1 16 c4t5006048452A79BD9d206
0.0 362.8 0.0 3134.8 0.2 0.4 0.6 1.1 1 10 c6t5006048452A79BD6d206
So,we can see that the kps for local server is much more than that of MIT server,during the time of max i/o operations.
Conclusions on the local and MIT server
A quick glance at your machines:
Theoretical performance problems
The performance of your application may be gated on one of several bottlenecks (assuming no code problems and network latencies/bottlenecks):
Looking at your issue, we can do a quick check:
iostat -d <disk> 5. The throughput value and number of operations/sec will be higher on the local server, and lower on the MIT serverAll the above is assuming that other processes on the servers are not interfering with the operation of your program – clearly another high-cpu process or one writing a lot to the same disk will affect the performance greatly.
Conclusions
From the CPU data you provide, there is no evidence of a CPU bottleneck.
From the
iostatdata you provide, as you comment, the IO on the SunFire is significantly below that of the local server. This is likely the result of the attached storage, namely at least one of:(Note that the same SAN appears connected to the local server, so this could be tested).
With clear evidence of a hardware being the cause of the performance difference, there is little that can be done.
Some things may improve the general performance of the application, though. It’s a good idea to run a Java profiler on the application. Examples include Netbeans and JProfiler.
The profiler will identify which IO operations are the problem. You might be able to:
EDIT: Thoughts on a caching layer
Assumption: That the problematic IO operation is either repeatedly writing small chunks to disk and flushing them, or keeps performing random-access write-to-disk operations. Your application may already be streaming to disk efficiently, in which case caching would not be useful.
When you have an expensive or slow operation in an application, you will want to minimize the number of times it is invoked – ideally to the theoretical minimum which hopefully is 1. However your code may not be doing so – for example you are using an OutputStream and writing small chunks to it and flushing to disk. In this case, you may write each disk block (8k) many times, each time with just a little more data.
Instead, you could use a RAM cache to consolidate all the writes; when you know there will be no more writes to the block, then you write it exactly once to disk. For streaming, Java has the BufferedOutputStream for this for simplistic cases. When you obtain the
FileOutputStreaminstance from theFile, wrap theFileOutputStreamin theBufferedOutputStreamand use only theBufferedOutputStream.If, however, you are performing true random-access writes (eg using a
java.io.RandomAccessFile), and moving the file pointer withRandomAccessFile.seek(), you may want to consider writing a write cache in RAM. Precisely what this would look like depends wholly on your file data structure, but you might want to start with a block paging mechanism. Chapter 1 of Java NIO has an introduction to those concepts, but hopefully you either don’t need to go there or you find a close match in the NIO API.