Flush processes consume too much CPU

9

The server is an EC2 instance whose job is to write files to a NAS (NFS) from HTTPD.

Processes such as flush-0:32 consume more than 90% CPU, and the load average is 65.50, 64.02, 66.59.

According to the graph, the load grows every day, whereas the initial load average was ~1.01, 2.02, 1.80 on 4 cores. I added another similar instance behind the load balancer, and its CPU usage is only about 6% at the moment.
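To compare the two sets of figures, the load averages can be divided by the core count. This is only a sketch of that arithmetic: the helper name load_per_core is hypothetical, and the numbers are the ones quoted above. On Linux the load average also counts tasks in uninterruptible sleep (for example, waiting on NFS I/O), so a high value does not by itself prove the CPUs are saturated.

```python
def load_per_core(load_averages, cores):
    """Divide each load average (1, 5, 15 minutes) by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return tuple(value / cores for value in load_averages)


# Figures quoted in the question; the instance has 4 cores.
baseline = load_per_core((1.01, 2.02, 1.80), 4)
current = load_per_core((65.50, 64.02, 66.59), 4)

print("baseline per core:", ", ".join(f"{v:.2f}" for v in baseline))
print("current per core: ", ", ".join(f"{v:.2f}" for v in current))
```

A per-core value well above 1 means more tasks are runnable or blocked on I/O than the cores can service.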

What exactly do these flush processes do?

Maybe we should disable the NFS attribute cache if the clients only need to write data?

Could this be caused by packet fragmentation?

Here are some stats from nfsstat -s -4:

=================================================================
Server 0:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
715054137   0          0          0          0       

Server nfs v4:
null         compound     
993       0% 715053143 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 143229323  6% 78092765  3% 36693816  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
3486926   0% 0         0% 0         0% 679872421 28% 158406682  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 95872524  4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
78173920  3% 0         0% 46442107  1% 1668      0% 715044032 29% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
7110      0% 42081145  1% 9116904   0% 0         0% 7026      0% 1257      0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
14        0% 81591622  3% 81659244  3% 0         0% 21028018  0% 3244      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3244      0% 0         0% 114086560  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 1:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
307172153   0          0          0          0       

Server nfs v4:
null         compound     
427       0% 307171725 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 58451998  5% 32717934  3% 15557564  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
1424670   0% 0         0% 0         0% 291829363 28% 67959378  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 41790934  4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
32741459  3% 0         0% 18993781  1% 75        0% 307167167 30% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
3108      0% 18598329  1% 3892199   0% 0         0% 742       0% 2         0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
3         0% 34142743  3% 34166131  3% 0         0% 8963430   0% 1449      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
1449      0% 0         0% 54628017  5% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 2:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
53026598   0          0          0          0       

Server nfs v4:
null         compound     
89        0% 53026508 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 9605179   5% 5472897   3% 2633853   1% 
create       delegpurge   delegreturn  getattr      getfh        link         
231276    0% 0         0% 0         0% 50395149 28% 11903036  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 7528324   4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
5477948   3% 0         0% 2760996   1% 15        0% 53025580 30% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
633       0% 3566001   2% 673466    0% 0         0% 11        0% 0         0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
2         0% 5704253   3% 5709223   3% 0         0% 1526967   0% 292       0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
292       0% 0         0% 10439844  5% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 3:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
63045403   0          0          0          0       

Server nfs v4:
null         compound     
121       0% 63045280 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 11504749  5% 6504139   3% 3119453   1% 
create       delegpurge   delegreturn  getattr      getfh        link         
271128    0% 0         0% 0         0% 59865633 28% 14058385  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 8852565   4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
6521385   3% 0         0% 3365913   1% 15        0% 63043988 30% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
874       0% 4209822   2% 791702    0% 0         0% 6         0% 0         0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
9         0% 6775367   3% 6792509   3% 0         0% 1811226   0% 409       0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
409       0% 0         0% 12368747  5% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 4:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
817288490   0          0          0          0       

Server nfs v4:
null         compound     
1204      0% 817287285 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 164320609  6% 89471711  3% 42448842  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
4101436   0% 0         0% 0         0% 778155935 28% 180629867  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 109104313  4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
89598740  3% 0         0% 53534516  1% 9727      0% 817288175 29% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
8020      0% 46348481  1% 10773529  0% 0         0% 100880    0% 12342     0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
80        0% 93709338  3% 93712518  3% 0         0% 24303185  0% 3352      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3352      0% 0         0% 127464001  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 5:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
804660319   0          0          0          0       

Server nfs v4:
null         compound     
1204      0% 804659114 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 161331719  6% 88318366  3% 41571552  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
4090384   0% 0         0% 0         0% 764533960 28% 177969216  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 107385644  4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
88321805  3% 0         0% 54307425  2% 444       0% 804647492 29% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
8353      0% 45980723  1% 10410930  0% 0         0% 88471     0% 440       0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
21        0% 92410970  3% 92412629  3% 0         0% 23733174  0% 3688      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3688      0% 0         0% 125268800  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 6:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
795385017   0          0          0          0       

Server nfs v4:
null         compound     
1204      0% 795383812 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 158633282  5% 87331357  3% 41400927  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
4080179   0% 0         0% 0         0% 756063664 28% 176355823  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 106692513  4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
87333398  3% 0         0% 53591273  2% 187       0% 795371861 29% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
8367      0% 45030006  1% 10352133  0% 0         0% 80473     0% 151       0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
28        0% 91411943  3% 91413728  3% 0         0% 23629833  0% 3707      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3707      0% 0         0% 124033436  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 7:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
801916264   0          0          0          0       

Server nfs v4:
null         compound     
1204      0% 801915059 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 161888929  6% 88285947  3% 41212864  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
4069479   0% 0         0% 0         0% 762072130 28% 177131560  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 106437411  3% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
88288337  3% 0         0% 54779036  2% 191       0% 801903449 29% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
8225      0% 45488312  1% 10259565  0% 0         0% 76243     0% 177       0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
4         0% 92355900  3% 92357993  3% 0         0% 23515286  0% 3558      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3558      0% 0         0% 123741908  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 8:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
804732833   1          1          0          0       

Server nfs v4:
null         compound     
1204      0% 804731628 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 161428939  6% 88340891  3% 41568432  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
4085332   0% 0         0% 0         0% 764396486 28% 177796853  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 107176837  3% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
88342790  3% 0         0% 54344886  2% 187       0% 804720008 29% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
8226      0% 46219141  1% 10381361  0% 0         0% 83380     0% 160       0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
0         0% 92426602  3% 92428282  3% 0         0% 23736349  0% 3554      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3554      0% 0         0% 125088530  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 9:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
800961003   0          0          0          0       

Server nfs v4:
null         compound     
1204      0% 800959798 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 161394704  6% 88264733  3% 41642226  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
4098314   0% 0         0% 0         0% 761225542 28% 172733291  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 102217363  3% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
88272429  3% 0         0% 53937975  2% 467       0% 800948292 30% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
8312      0% 45893437  1% 10409370  0% 0         0% 83127     0% 478       0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
35        0% 92369729  3% 92371221  3% 0         0% 23772628  0% 3637      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3637      0% 0         0% 124997490  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 
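Since the question is whether attribute traffic matters for a write-only workload, it may help to compare the getattr and write counters directly. The following is a rough sketch, assuming the output keeps the alternating name row / value row layout shown above; parse_nfsstat_ops is a hypothetical helper, not part of nfsstat.

```python
import re


def parse_nfsstat_ops(text):
    """Map operation name -> call count from alternating name/value rows."""
    counts = {}
    lines = [line for line in text.splitlines() if line.strip()]
    for names_line, values_line in zip(lines[0::2], lines[1::2]):
        names = names_line.split()
        # Each value cell looks like "143229323  6%": keep the count only.
        values = re.findall(r"(\d+)\s+\d+%", values_line)
        for name, value in zip(names, values):
            counts[name] = int(value)
    return counts


# Three row pairs copied from the "Server 0" block above.
SAMPLE = """\
op0-unused   op1-unused   op2-future   access       close        commit
0         0% 0         0% 0         0% 143229323  6% 78092765  3% 36693816  1%
create       delegpurge   delegreturn  getattr      getfh        link
3486926   0% 0         0% 0         0% 679872421 28% 158406682  6% 0         0%
setcltidconf verify       write        rellockowner bc_ctl       bind_conn
3244      0% 0         0% 114086560  4% 0         0% 0         0% 0         0%
"""

ops = parse_nfsstat_ops(SAMPLE)
ratio = ops["getattr"] / ops["write"]
print(f"getattr calls per write call: {ratio:.1f}")
```

In this sample getattr calls outnumber write calls several times over, which matches the 28% versus 4% shares nfsstat reports.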
Roman Newaza

How much is too much?
Fred Foo
Since this question amounts to a kernel bug report, the kernel version would help.
Dmitrij Chubarow

Answers:

6

The flush processes are responsible for managing the writeback of dirty pages to the filesystem they belong to. They should never take much CPU; they spend most of their time waiting on the disk (or on the network, for NFS and the like). If you see high CPU usage by a flush process, it may be a kernel bug. Try rebooting, which should clear the state.
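One way to see how much work the flush threads have queued is to look at the Dirty and Writeback fields of /proc/meminfo. The sketch below is an illustration rather than a diagnostic tool: the helper names are hypothetical, and the sample text is made up, not taken from the server in the question.

```python
def parse_meminfo(text):
    """Return {field: number} for lines shaped like 'Dirty:   123 kB'."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def writeback_backlog(text):
    """Pick out the two fields that describe pending writeback, in kB."""
    info = parse_meminfo(text)
    return {key: info.get(key, 0) for key in ("Dirty", "Writeback")}


# Illustrative sample only; real values come from /proc/meminfo.
SAMPLE = """MemTotal:       16326428 kB
Dirty:              1824 kB
Writeback:             0 kB
"""

print(writeback_backlog(SAMPLE))
```

On a live Linux host the same function can be fed the contents of /proc/meminfo, for example with open("/proc/meminfo").read().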

bdonlan

Hi! What are dirty pages? Yes, I just terminated the failing instance, and the others are running normally at the moment.
Roman Newaza
Dirty pages are portions of files that are cached in memory and have been modified there by some process, but have not yet been written to disk. Normally this writeback happens in the background, unless the amount of dirty pages is too high, free memory is too low, or a program explicitly asks for it. When the writeback happens in the background, it is flush's job to kick it off.
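The "program explicitly asks" case can be illustrated with fsync. This is a minimal sketch, assuming a local temporary directory rather than the NFS mount from the question; write_durably is a hypothetical helper name.

```python
import os
import tempfile


def write_durably(path, data):
    """Write bytes and ask the kernel to write them back before returning."""
    with open(path, "wb") as handle:
        handle.write(data)          # data ends up as dirty pages in the page cache
        handle.flush()              # push Python's userspace buffer to the kernel
        os.fsync(handle.fileno())   # block until the kernel has written the file back
    return len(data)


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "upload.bin")
    written = write_durably(target, b"example payload")
    size_on_disk = os.path.getsize(target)

print(written, size_on_disk)
```

Without the fsync call, the write would return as soon as the pages were marked dirty, and the background flush thread would write them out later.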
bdonlan
What if I run echo 3 > /proc/sys/vm/drop_caches to free the page cache, dentries and inodes?
Roman Newaza
@RomanNewaza That will only drop caches that are not locked or dirty. flush normally does nothing with non-dirty caches, so I doubt it would make a difference (and it will seriously hurt performance on the server until the caches warm up again).
bdonlan
I see. It looks like the problem went away after a kernel update.
Roman Newaza