Gentoo Forums
Segmentation fault with Eigen library and GMP rationals
amandini
n00b


Joined: 01 Jul 2020
Posts: 23

Posted: Mon Nov 18, 2024 6:25 am    Post subject: Segmentation fault with Eigen library and GMP rationals

Hello,

I am doing some work using the Eigen C++ template library, with large sparse matrices, say 200,000 x 200,000, over the mpq_class provided by the GNU Multiple Precision (GMP) library.

The first problem is that code which compiles without any warnings on the godbolt compiler (a small example is given at https://godbolt.org/z/j66jdK7n6) produces the following warnings when compiled on my Gentoo system:

Code:

#g++ -std=c++20 -O2 -march=native -lgmp -lgmpxx -Iboost -Wall -Wpedantic -Wextra example.cpp  -o example.o
In file included from /usr/include/eigen3/Eigen/SparseCore:40,
                 from example.cpp:6:
/usr/include/eigen3/Eigen/src/SparseCore/AmbiVector.h: In instantiation of ‘void Eigen::internal::AmbiVector<_Scalar, _StorageIndex>::reallocateSparse() [with _Scalar = __gmp_expr<__mpq_struct [1], __mpq_struct [1]>; _StorageIndex = int]’:
/usr/include/eigen3/Eigen/src/SparseCore/AmbiVector.h:238:11:   required from ‘_Scalar& Eigen::internal::AmbiVector<_Scalar, _StorageIndex>::coeffRef(Eigen::Index) [with _Scalar = __gmp_expr<__mpq_struct [1], __mpq_struct [1]>; _StorageIndex = int; Eigen::Index = long int]’
/usr/include/eigen3/Eigen/src/SparseCore/TriangularSolver.h:233:28:   required from ‘static void Eigen::internal::sparse_solve_triangular_sparse_selector<Lhs, Rhs, Mode, UpLo, 0>::run(const Lhs&, Rhs&) [with Lhs = const Eigen::SparseMatrix<__gmp_expr<__mpq_struct [1], __mpq_struct [1]> >; Rhs = Eigen::SparseMatrix<__gmp_expr<__mpq_struct [1], __mpq_struct [1]> >; int Mode = 6; int UpLo = 2]’
/usr/include/eigen3/Eigen/src/SparseCore/TriangularSolver.h:306:93:   required from ‘void Eigen::TriangularViewImpl<MatrixType, Mode, Eigen::Sparse>::solveInPlace(Eigen::SparseMatrixBase<OtherDerived>&) const [with OtherDerived = Eigen::SparseMatrix<__gmp_expr<__mpq_struct [1], __mpq_struct [1]> >; MatrixType = const Eigen::SparseMatrix<__gmp_expr<__mpq_struct [1], __mpq_struct [1]> >; unsigned int Mode = 6]’
example.cpp:80:61:   required from here
/usr/include/eigen3/Eigen/src/SparseCore/AmbiVector.h:97:18: warning: ‘void* memcpy(void*, const void*, size_t)’ writing to an object of type ‘Eigen::internal::AmbiVector<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, int>::Scalar’ {aka ‘class __gmp_expr<__mpq_struct [1], __mpq_struct [1]>’} with no trivial copy-assignment; use copy-assignment or copy-initialization instead [-Wclass-memaccess]
   97 |       std::memcpy(newBuffer,  m_buffer,  copyElements * sizeof(ListEl));
      |       ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from example.cpp:4:
/usr/include/gmpxx.h:1763:7: note: ‘Eigen::internal::AmbiVector<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, int>::Scalar’ {aka ‘class __gmp_expr<__mpq_struct [1], __mpq_struct [1]>’} declared here
 1763 | class __gmp_expr<mpq_t, mpq_t>
      |       ^~~~~~~~~~~~~~~~~~~~~~~~


Eigen developers informed me that they could not reproduce these warnings on their machines. Nor are the warnings reproducible on the compiler explorer https://godbolt.org/z/j66jdK7n6.
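
For context on what the diagnostic itself means: GCC's -Wclass-memaccess fires when memcpy writes into an object of a class type that has no trivial copy-assignment, which is the case for mpq_class according to the message above. The following is a minimal sketch of that distinction only. OwningScalar is a hypothetical stand-in for mpq_class so that the snippet builds without GMP, and copy_elements is an illustrative helper, not Eigen's code and not a proposed patch to AmbiVector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a scalar that owns heap memory, as mpq_class does.
struct OwningScalar {
    long* p;
    OwningScalar() : p(new long(0)) {}
    OwningScalar(const OwningScalar& o) : p(new long(*o.p)) {}
    OwningScalar& operator=(const OwningScalar& o) { *p = *o.p; return *this; }
    ~OwningScalar() { delete p; }
};

// Copy n elements from src to dst. memcpy is used only when T is trivially
// copyable; otherwise each element goes through its copy-assignment operator.
template <class T>
void copy_elements(T* dst, const T* src, std::size_t n) {
    assert(n == 0 || (dst != nullptr && src != nullptr));
    if constexpr (std::is_trivially_copyable_v<T>) {
        std::memcpy(dst, src, n * sizeof(T));
    } else {
        for (std::size_t i = 0; i < n; ++i) dst[i] = src[i];
    }
}
```

Because the memcpy branch is discarded at compile time for non-trivially-copyable types, this helper should not trigger the warning for a type like OwningScalar.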

The second and more pressing issue is that constructing large sparse matrices (175,175 x 175,175) over mpq_class in Eigen causes the following segmentation fault:
Code:
Thread 1 "main.o" received signal SIGSEGV, Segmentation fault.
0x00007ffff79d525e in free () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff79d525e in free () from /lib64/libc.so.6
#1  0x00007ffff7edee80 in __gmpq_clear () from /usr/lib64/libgmp.so.10
#2  0x000055555555ff4d in __gmp_expr<__mpq_struct [1], __mpq_struct [1]>::~__gmp_expr (this=0x7fffffffd510, __in_chrg=<optimized out>)
    at /usr/include/gmpxx.h:1855
#3  Eigen::SparseMatrix<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, 1, int>::insertBackUncompressed (col=<optimized out>, row=<optimized out>, this=0x7fffffffd4c0) at /usr/include/eigen3/Eigen/src/SparseCore/SparseMatrix.h:904
#4  Eigen::internal::set_from_triplets<__gnu_cxx::__normal_iterator<Eigen::Triplet<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, int> const*, std::vector<Eigen::Triplet<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, int>, std::allocator<Eigen::Triplet<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, int> > > >, Eigen::SparseMatrix<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, 0, int>, Eigen::internal::scalar_sum_op<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, __gmp_expr<__mpq_struct [1], __mpq_struct [1]> > > (begin=..., end=..., mat=..., dup_func=...) at /usr/include/eigen3/Eigen/src/SparseCore/SparseMatrix.h:1056
#5  0x000055555556cf22 in Eigen::SparseMatrix<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, 0, int>::setFromTriplets<__gnu_cxx::__normal_iterator<Eigen::Triplet<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, int> const*, std::vector<Eigen::Triplet<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, int>, std::allocator<Eigen::Triplet<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, int> > > > > (end=..., begin=..., this=0x7fffffffdb78) at /usr/include/eigen3/Eigen/src/Core/functors/BinaryFunctors.h:36
#6  p_basis<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, Eigen::SparseMatrix<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, 0, int> >::basis_builder (this=this@entry=0x7fffffffdb38, src=..., seminormal_form=..., p=p@entry=3) at /home/user_name/path_to_file/p_basis.h:165
#7  0x0000555555570c69 in p_basis<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, Eigen::SparseMatrix<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, 0, int > >::p_basis (this=0x7fffffffdb38, src=..., seminormal_form=..., p=3) at /home/user_name/path_to_file/p_basis.h:57
#8  0x00005555555718c5 in basis_data<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, Eigen::SparseMatrix<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, 0, int> >::basis_data (p=3, src=..., this=0x7fffffffdaf8) at /home/user_name/path_to_file/basis_data.h:61
#9  mform<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, Eigen::SparseMatrix<__gmp_expr<__mpq_struct [1], __mpq_struct [1]>, 0, int> >::mform (this=0x7fffffffdaf0, src=..., path="/home/user_name/output.txt", p=3) at /home/user_name/path_to_file/mform.h:36
#10 0x0000555555559995 in main () at /usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13/bits/basic_string.tcc:242
(gdb)


The code that produced the segmentation fault above was
Code:
typedef Eigen::Triplet<mpq_class> qtriplet;
std::vector<qtriplet> triplets{};

and line 165 of the file p_basis.h referred to in the gdb backtrace above is
Code:
m_basis.setFromTriplets(std::cbegin(triplets), std::cend(triplets));
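
One inexpensive check before that call, offered only as a guess that nothing in this thread confirms: the backtrace shows the matrix type as Eigen::SparseMatrix<..., 0, int>, so entry positions are stored as int, and SparseMatrix takes that index type as its third template argument. If the number of triplets ever exceeded what int can hold, the index arithmetic could overflow. The sketch below uses only the standard library; fits_storage_index is a hypothetical helper name, not an Eigen function.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: true when `count` entries can be indexed by the
// signed integer type StorageIndex without overflow.
template <class StorageIndex>
constexpr bool fits_storage_index(std::size_t count) {
    static_assert(std::is_integral_v<StorageIndex> && std::is_signed_v<StorageIndex>,
                  "StorageIndex is assumed to be a signed integer type");
    return static_cast<std::uintmax_t>(count) <=
           static_cast<std::uintmax_t>(std::numeric_limits<StorageIndex>::max());
}
```

It could be called as fits_storage_index<int>(triplets.size()) just before setFromTriplets; a false result would suggest trying a wider index type, while a true result rules this particular guess out.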

Would anyone at the Gentoo Science Project (Eigen) or the Gentoo Toolchain Project (GMP) be willing or able to help look into these issues?

Any other suggestions that may help get to the bottom of this issue would be gratefully received.

[edited for typos]


Last edited by amandini on Mon Nov 18, 2024 8:25 am; edited 1 time in total
sam_
Developer


Joined: 14 Aug 2020
Posts: 1972

Posted: Mon Nov 18, 2024 6:37 am

I see the warning locally as well but not on godbolt. Not sure why yet.

What does -march=native expand to on your system? You can check by running resolve-march-native from app-misc/resolve-march-native. I do see the warning without -march=native, but it may be useful to know nonetheless.

Could you give me a testcase which segfaults? Can look more then.
amandini
n00b


Joined: 01 Jul 2020
Posts: 23

Posted: Mon Nov 18, 2024 7:05 am

sam_ wrote:
What does -march=native expand to on your system? You can check by running resolve-march-native from app-misc/resolve-march-native.

I see the warning locally as well but not on godbolt. Not sure why yet.

Could you give me a testcase which segfaults?


Thank you for your reply. I run these calculations on two workstations. One is Intel, on which -march=native expands to -march=tigerlake; the other is AMD (a Ryzen Threadripper Pro 3955WX), on which it expands to -march=znver2.

The compiler warnings occur identically on both the AMD and the Intel machine.

The example on godbolt was only contrived to illustrate the compiler warnings. As it is, it took many lines of code and over 500GB of RAM on the AMD machine to produce the segmentation fault, so it may not be practical to share this specific example.

I have not tried to reproduce the segmentation fault on the Intel machine, which has less memory.

However, it may be possible to come up with a contrived and simpler example that reproduces the segmentation fault. Unfortunately, it may take a bit of time to get such an example working - my apologies in advance.

Should I go ahead and try to make such an example?
bstaletic
Guru


Joined: 05 Apr 2014
Posts: 374

Posted: Mon Nov 18, 2024 8:02 am

I'm not familiar with GMP, but that operator< looks very wrong - sometimes it tells you which of the two inputs is smaller and sometimes it does the opposite.

Code:

x   y | result 
---------------
0   0 | false
0   y | true 
x   0 | false
x  2x | false
2x  x | true 


The last two return true if x is greater than y.
The first three return true if y is greater than x.
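
Whether an ordering like this is safe to hand to standard algorithms or ordered containers depends on the strict weak ordering axioms rather than on agreeing with numeric order, and those axioms can be brute-forced over a handful of sample values. A minimal sketch follows. table_less is a hypothetical comparator written only to reproduce the five rows of the table for positive x and y; it is not the operator< from the godbolt example, so its result says nothing about the real code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical comparator that reproduces the table rows above for
// positive x and y. NOT the operator< from the godbolt example.
inline bool table_less(long a, long b) {
    if (a == 0) return b != 0;   // rows (0,0) and (0,y)
    return b != 0 && a > b;      // rows (x,0), (x,2x) and (2x,x)
}

// Brute-force check of the strict weak ordering axioms over sample values.
template <class Cmp>
bool is_strict_weak_order(const std::vector<long>& samples, Cmp cmp) {
    assert(!samples.empty());
    // Two values are equivalent when neither is ordered before the other.
    auto equiv = [&](long a, long b) { return !cmp(a, b) && !cmp(b, a); };
    for (long a : samples) {
        if (cmp(a, a)) return false;                                  // irreflexivity
        for (long b : samples) {
            if (cmp(a, b) && cmp(b, a)) return false;                 // asymmetry
            for (long c : samples) {
                if (cmp(a, b) && cmp(b, c) && !cmp(a, c)) return false;        // transitivity
                if (equiv(a, b) && equiv(b, c) && !equiv(a, c)) return false;  // transitive equivalence
            }
        }
    }
    return true;
}
```

Running the same check against the actual operator< with representative values would show whether the inconsistency in the table is a real defect or just an unconventional ordering.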