Yes, a poll()-based implementation is clearly better practice than
select(), but my problem seems to be a little different. The problem I
face is that I run out of file descriptors after a certain number of
calls to loop().
Each call consumes a few fd's, which seem to be closed properly. In fact,
though, these fd's are never reused, and I can see them piling up in
/proc/{pid}/fdinfo. My implementation does not use many fd's; the problem
is that they stay open and are not reused. Eventually I run out of fd's
and hit a buffer overflow crash.
On Wed, Feb 26, 2020 at 12:59 PM Pino Toscano <ptoscano(a)redhat.com> wrote:
On Wednesday, 26 February 2020 10:43:27 CET Richard W.M. Jones wrote:
> On Wed, Feb 26, 2020 at 11:21:18AM +0200, Veselin Kozhuharski wrote:
> > Hallo Rich,
> >
> > Here is the fd list and total number just before collectd application
> > crashes. Before that the number of used fd's is constantly increasing. It
> > looks like a fd leak inside libguestfs to me. I am trying to debug the fd
> > handling inside the library.
> >
> > root@localhost:~# less /proc/8829/fdinfo/
> > Display all 1035 possibilities? (y or n)
> > 0 1029 129 161 193 224 256 288 319 350 382 413 445
> > 477 508 54 571 602 634 666 698 729 760 792 823
> > 855 887 918 95 981
> > 1 103 13 162 194 225 257 289 32 351 383 414 446
> > 478 509 540 572 603 635 667 699 73 761 793 824
> > 856 888 919 950 982
> > 10 1030 130 163 195 226 258 29 320 352 384 415 447
> > 479 51 541 573 604 636 668 7 730 762 794 825
> > 857 889 92 951 983
> > 100 1031 131 164 196 227 259 290 321 353 385 416 448
> > 48 510 542 574 605 637 669 70 731 763 795 826
> > 858 89 920 952 984
> > 1000 1032 132 165 197 228 26 291 322 354 386 417 449
> > 480 511 543 575 606 638 67 700 732 764 796 827
> > 859 890 921 953 985
> > 1001 1033 133 166 198 229 260 292 323 355 387 418 45
> > 481 512 544 576 607 639 670 701 733 765 797 828
> > 86 891 922 954 986
> > 1002 1034 134 167 199 23 261 293 324 356 388 419 450
> > 482 513 545 577 608 64 671 702 734 766 798 829
> > 860 892 923 955 987
> > 1003 1035 135 168 2 230 262 294 325 357 389 42 451
> > 483 514 546 578 609 640 672 703 735 767 799 83
> > 861 893 924 956 988
> > 1004 104 136 169 20 231 263 295 326 358 39 420 452
> > 484 515 547 579 61 641 673 704 736 768 8 830
> > 862 894 925 957 989
> > 1005 105 137 17 200 232 264 296 327 359 390 421 453
> > 485 516 548 58 610 642 674 705 737 769 80 831
> > 863 895 926 958 99
> > 1006 106 138 170 201 233 265 297 328 36 391 422 454
> > 486 517 549 580 611 643 675 706 738 77 800 832
> > 864 896 927 959 990
> > 1007 107 139 171 202 234 266 298 329 360 392 423 455
> > 487 518 55 581 612 644 676 707 739 770 801 833
> > 865 897 928 96 991
> > 1008 108 14 172 203 235 267 299 33 361 393 424 456
> > 488 519 550 582 613 645 677 708 74 771 802 834
> > 866 898 929 960 992
> > 1009 109 140 173 204 236 268 3 330 362 394 425 457
> > 489 52 551 583 614 646 678 709 740 772 803 835
> > 867 899 93 961 993
> > 101 11 141 174 205 237 269 30 331 363 395 426 458
> > 49 520 552 584 615 647 679 71 741 773 804 836
> > 868 9 930 962 994
> > 1010 110 142 175 206 238 27 300 332 364 396 427 459
> > 490 521 553 585 616 648 68 710 742 774 805 837
> > 869 90 931 963 995
> > 1011 111 143 176 207 239 270 301 333 365 397 428 46
> > 491 522 554 586 617 649 680 711 743 775 806 838
> > 87 900 932 964 996
> > 1012 112 144 177 208 24 271 302 334 366 398 429 460
> > 492 523 555 587 618 65 681 712 744 776 807 839
> > 870 901 933 965 997
> > 1013 113 145 178 209 240 272 303 335 367 399 43 461
> > 493 524 556 588 619 650 682 713 745 777 808 84
> > 871 902 934 966 998
> > 1014 114 146 179 21 241 273 304 336 368 4 430 462
> > 494 525 557 589 62 651 683 714 746 778 809 840
> > 872 903 935 967 999
> > 1015 115 147 18 210 242 274 305 337 369 40 431 463
> > 495 526 558 59 620 652 684 715 747 779 81 841
> > 873 904 936 968
> > 1016 116 148 180 211 243 275 306 338 37 400 432 464
> > 496 527 559 590 621 653 685 716 748 78 810 842
> > 874 905 937 969
> > 1017 117 149 181 212 244 276 307 339 370 401 433 465
> > 497 528 56 591 622 654 686 717 749 780 811 843
> > 875 906 938 97
> > 1018 118 150 182 213 245 277 308 34 371 402 434 466
> > 498 529 560 592 623 655 687 718 75 781 812 844
> > 876 907 939 970
> > 1019 119 151 183 214 246 278 309 340 372 403 435 467
> > 499 53 561 593 624 656 688 719 750 782 813 845
> > 877 908 94 971
> > 102 12 152 184 215 247 279 31 341 373 404 436 468
> > 5 530 562 594 625 657 689 72 751 783 814 846
> > 878 909 940 972
> > 1020 120 153 185 216 248 28 310 342 374 405 437 469
> > 50 531 563 595 626 658 69 720 752 784 815 847
> > 879 91 941 973
> > 1021 121 154 186 217 249 280 311 343 375 406 438 47
> > 500 532 564 596 627 659 690 721 753 785 816 848
> > 88 910 942 974
> > 1022 122 155 187 218 25 281 312 344 376 407 439 470
> > 501 533 565 597 628 66 691 722 754 786 817 849
> > 880 911 943 975
> > 1023 123 156 188 219 250 282 313 345 377 408 44 471
> > 502 534 566 598 629 660 692 723 755 787 818 85
> > 881 912 944 976
> > 1024 124 157 189 22 251 283 314 346 378 409 440 472
> > 503 535 567 599 63 661 693 724 756 788 819 850
> > 882 913 945 977
> > 1025 125 158 19 220 252 284 315 347 379 41 441 473
> > 504 536 568 6 630 662 694 725 757 789 82 851
> > 883 914 946 978
> > 1026 126 159 190 221 253 285 316 348 38 410 442 474
> > 505 537 569 60 631 663 695 726 758 79 820 852
> > 884 915 947 979
> > 1027 127 16 191 222 254 286 317 349 380 411 443 475
> > 506 538 57 600 632 664 696 727 759 790 821 853
> > 885 916 948 98
> > 1028 128 160 192 223 255 287 318 35 381 412 444 476
> > 507 539 570 601 633 665 697 728 76 791 822 854
> > 886 917 949 980
> >
> > Do you suspect any particular handling inside libguestfs?
> > Thanks!
>
> Yes I guess the select() function here needs to be replaced with poll().
>
> https://github.com/libguestfs/libguestfs/blob/d9b4e3086e11b18dfc5215a7c4c...
Indeed, we ought to. I'll try to get it converted. Also, note for
libguestfs people: select() is used also in the daemon, although that
should not be a problem, since it does not use many libraries, and in
general has few fd's opened at runtime.
Note that collectd needs similar fixes, as I read in its bug tracker:
https://github.com/collectd/collectd/pull/3363
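For context on why a descriptor leak can end in a crash rather than a
clean error: an fd_set holds only FD_SETSIZE descriptors (1024 on Linux
with glibc), and POSIX leaves FD_SET() undefined for a descriptor at or
above that limit, so select() on such an fd can write out of bounds. The
listing above already contains descriptors above 1023. poll() takes the
descriptor as a plain int and has no such limit. A minimal sketch of a
poll()-based wait on one descriptor follows; wait_readable() is a
hypothetical helper for illustration, not the actual libguestfs change.

```c
#include <errno.h>
#include <poll.h>

/* Wait until fd is readable or timeout_ms milliseconds elapse.
 * Returns 1 if a read() would not block (data, hang-up or error
 * condition), 0 on timeout, -1 on failure (including an invalid fd).
 * Note: a retry after EINTR restarts the full timeout. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int r;

    do {
        r = poll(&pfd, 1, timeout_ms);
    } while (r == -1 && errno == EINTR);

    if (r <= 0)
        return r;   /* 0 = timeout, -1 = poll() failed */

    return (pfd.revents & (POLLIN | POLLHUP | POLLERR)) ? 1 : -1;
}
```

Because the descriptor is passed by value in struct pollfd, the same call
should behave the same way whether fd is 5 or 5000, which is the property
select() lacks.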
--
Pino Toscano
--
Veselin Kozhuharski | Software Engineer