Hi @Masoum, the following are definitions from the robot_localization package:

The **estimate covariance**, commonly
denoted **P** (not R, which conventionally
denotes the measurement noise
covariance), defines the error in the
current state estimate. The parameter
allows users to set the initial value
for the matrix, which will affect how
quickly the filter converges. For
example, if users set the value at
position [0,0] to a very small value,
e.g., 1e-12, and then attempt to fuse
measurements of X position with a high
variance value for X, then the filter
will be very slow to “trust” those
measurements, and the time required
for convergence will increase. Again,
users should take care with this
parameter. When only fusing velocity
data (e.g., no absolute pose
information), users will likely not
want to set the initial covariance
values for the absolute pose variables
to large numbers. This is because
those errors are going to grow without
bound (owing to the lack of absolute
pose measurements to reduce the
error), and starting them with large
values will not benefit the state
estimate.
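The effect described above is easy to reproduce with a toy scalar Kalman filter. This is just an illustrative sketch in Python, not robot_localization code, and all the numbers (process noise, measurement variance, the truth value of 10.0) are made up: with a tiny initial P the filter barely trusts the measurements, while a large initial P lets it converge quickly.

```python
import numpy as np

def kalman_1d(z_meas, p0, q=1e-4, r=4.0):
    """Scalar Kalman filter: state x, estimate covariance p.

    p0 is the initial estimate covariance; r is the measurement
    variance. All values here are illustrative.
    """
    x, p = 0.0, p0
    estimates = []
    for z in z_meas:
        p += q                   # predict: covariance grows by process noise
        k = p / (p + r)          # Kalman gain
        x += k * (z - x)         # correct toward the measurement
        p *= 1.0 - k
        estimates.append(x)
    return estimates

# Truth is 10.0; measurements are noisy around it.
rng = np.random.default_rng(0)
z = 10.0 + rng.normal(0.0, 2.0, size=50)

tiny_p0 = kalman_1d(z, p0=1e-12)   # filter "trusts" its initial state
big_p0 = kalman_1d(z, p0=1e3)      # filter "trusts" the measurements

print(f"after 50 steps: p0=1e-12 -> {tiny_p0[-1]:.2f}, p0=1e3 -> {big_p0[-1]:.2f}")
```

With `p0=1e-12` the estimate is still far from 10.0 after 50 measurements, exactly the slow-convergence behaviour the documentation warns about.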

The **process noise covariance**, commonly
denoted Q, is used to model
uncertainty in the prediction stage of
the filtering algorithms. It can be
difficult to tune, and has been
exposed as a parameter for easier
customization. This parameter can be
left alone, but you will achieve
superior results by tuning it. In
general, the larger the value for **Q**
relative to the variance for a given
variable in an input message, the
faster the filter will converge to the
value in the measurement.
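The same toy filter shows the Q-versus-measurement-variance trade-off (again a sketch with invented values, not the package's implementation): when Q is tiny relative to the measurement variance r, the filter is slow to move off its prior; when Q is comparable to r, it converges to the measurements quickly.

```python
import numpy as np

def kalman_1d(z_meas, q, r, x0=0.0, p0=1e-4):
    # Scalar Kalman filter; q = process noise, r = measurement variance.
    x, p = x0, p0
    for z in z_meas:
        p += q                  # predict: Q inflates the estimate covariance
        k = p / (p + r)         # larger p relative to r -> larger gain
        x += k * (z - x)
        p *= 1.0 - k
    return x

# Truth is 5.0; the filter starts (wrongly) confident that the state is 0.
rng = np.random.default_rng(1)
z = 5.0 + rng.normal(0.0, 1.0, size=20)   # measurement variance r = 1.0

small_q = kalman_1d(z, q=1e-6, r=1.0)     # Q << r: slow to leave the prior
large_q = kalman_1d(z, q=1.0, r=1.0)      # Q ~ r: snaps to the measurements
print(f"Q=1e-6 -> {small_q:.2f}, Q=1.0 -> {large_q:.2f}  (truth 5.0)")
```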

There are many ways to initialize the covariances. As noted in the robot_localization documentation, one approach is trial and error, sometimes called engineer's intuition: start with very small values and judge the results by how long the filter takes to converge. For the process noise covariance, you can start high and lower the values until you are satisfied. The disadvantage of this method is that it is time-consuming, the number of possible combinations is large, and you end up optimizing one parameter at a time.

If you are looking for a systematic approach, you can use the method proposed in this paper:
https://arxiv.org/ftp/arxiv/papers/17.... For example, the paper proposes sweeping the Q/R ratio and picking the one with the smallest MSE (mean squared error) to find the best performance, rather than optimizing one parameter at a time. That is not the only approach proposed in the paper; there are several more, and I'd highly recommend reading it.
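Here is a minimal sketch of the ratio-sweep idea on simulated data. The ground truth, noise levels, and sweep range below are all assumptions for illustration, and the exact procedure in the paper differs; the point is simply to score each Q/R ratio by MSE and keep the best one.

```python
import numpy as np

def run_filter(z, q, r, x0=0.0, p0=1.0):
    # Scalar Kalman filter; returns the sequence of state estimates.
    x, p, out = x0, p0, []
    for meas in z:
        p += q
        k = p / (p + r)
        x += k * (meas - x)
        p *= 1.0 - k
        out.append(x)
    return np.array(out)

# Simulated ground truth: a slow ramp, observed with noise of variance r_true.
rng = np.random.default_rng(2)
truth = np.linspace(0.0, 10.0, 200)
r_true = 4.0
z = truth + rng.normal(0.0, np.sqrt(r_true), size=truth.size)

# Sweep the Q/R ratio and keep the one with the smallest MSE against truth.
ratios = np.logspace(-6, 2, 50)
mse = [np.mean((run_filter(z, q=ratio * r_true, r=r_true) - truth) ** 2)
       for ratio in ratios]
best = ratios[int(np.argmin(mse))]
print(f"best Q/R ratio ~ {best:.2e}, MSE = {min(mse):.3f}")
```

In practice you would score the sweep against a trusted reference trajectory (e.g., motion capture or survey data) instead of simulated ground truth.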

Another approach I have seen is to use PSO (Particle Swarm Optimization) for EKF tuning, with great results: https://www.naun.org/main/NAUN/circui.... This is a more consistent way of finding the parameters; however, it is more complex to implement.
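To give a flavour of the idea, here is a toy PSO tuning a single parameter, log10(Q), of the same scalar filter against simulated ground truth. This is my own minimal sketch, not the algorithm from the linked paper, and the swarm size, coefficients, and data are all arbitrary choices.

```python
import numpy as np

def run_filter(z, q, r):
    # Scalar Kalman filter returning the sequence of state estimates.
    x, p, out = 0.0, 1.0, []
    for meas in z:
        p += q
        k = p / (p + r)
        x += k * (meas - x)
        p *= 1.0 - k
        out.append(x)
    return np.array(out)

# Simulated slowly varying truth with noisy measurements.
rng = np.random.default_rng(3)
truth = np.sin(np.linspace(0, 6, 300))
r = 0.25
z = truth + rng.normal(0.0, 0.5, size=truth.size)

def cost(log_q):
    # Fitness: MSE of the filter output against ground truth.
    return np.mean((run_filter(z, q=10.0 ** log_q, r=r) - truth) ** 2)

# Minimal particle swarm over the single parameter log10(Q).
n, iters = 12, 30
pos = rng.uniform(-8, 2, size=n)       # particle positions in log-space
vel = np.zeros(n)
pbest = pos.copy()                     # each particle's best position so far
pbest_cost = np.array([cost(p) for p in pos])
gbest = pbest[np.argmin(pbest_cost)]   # swarm's best position so far

for _ in range(iters):
    r1, r2 = rng.random(n), rng.random(n)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    c = np.array([cost(p) for p in pos])
    improved = c < pbest_cost
    pbest[improved], pbest_cost[improved] = pos[improved], c[improved]
    gbest = pbest[np.argmin(pbest_cost)]

print(f"PSO-tuned Q ~ {10.0 ** gbest:.2e}, MSE = {cost(gbest):.4f}")
```

The same structure extends to tuning several diagonal entries of Q at once, which is where PSO starts to pay off over a one-parameter-at-a-time sweep.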