I encountered the following problem (more details are given at the end of the post) and I am trying to figure out the best way to perform a null hypothesis test. I looked for similar questions (like this), but they do not fit my problem exactly.

I have a random vector $X = (X_1,\dots,X_N)$ of $N$ binary random variables, which are not necessarily independent and not identically distributed. These $N$ variables are divided into two subsets: $A$ with $N_A$ variables and $B$ with $N_B$ variables ($N_A + N_B = N$), so I can also write $X = (X_A,X_B)$. I know the marginal distribution of each of the binary variables, as well as the first and second moments of the random vector.

Now I consider another random vector $Y = (Y_1,\dots,Y_N) = (Y_A, Y_B)$, from which I can only sample in two steps: first sample $Y_A$ (obtaining some string $(a_1,\dots,a_{N_A})$), and then sample $(Y_B | Y_A = (a_1,\dots,a_{N_A}))$. The null hypothesis is that $Y$ follows the same distribution as $X$.

The problem is that the set of possible outcomes for $A$ is too large, so the probability of obtaining the same $a = (a_1,\dots,a_{N_A})$ twice is negligible. Thus, the conditional distribution of the random variables in subset $B$ changes in each iteration. Since I cannot repeat the sampling under identical conditions, I cannot use the usual central limit theorem to approximate the experimental mean by a Gaussian and perform the typical Gaussian hypothesis tests.

You can imagine this as having $N$ biased coins, each with a different bias, where the coins may not be independent. First I throw $N_A$ of the coins, which conditions the possible outcomes of the second set $B$.

How can I test my null hypothesis under these restrictions?

More details of the problem: I am dealing with a problem in quantum mechanics, where I have a state of $N$ spins that might be entangled (hence the variables are not independent).
The data correspond to measuring part of the system first (subsystem $A$), which collapses the whole state and conditions the possible outcomes of the rest of the system (subsystem $B$). Because the set of possible outcomes for subsystem $A$ is very large, and because measuring destroys the state, sampling subsystem $A$ twice and obtaining the same result is highly unlikely.

Thank you very much in advance! Any idea or suggestion is highly appreciated!
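
To make the two-step sampling from the coin analogy concrete, here is a toy simulation. It is only a sketch: the biases, the way the $B$ coins depend on the outcome of $A$, the sizes `N_A = 40` and `N_B = 5`, and all function names are made up for illustration and do not come from the actual spin system. It counts how many distinct strings $a$ appear over many rounds, which illustrates why the conditional distribution of $B$ is effectively different in every round.

```python
import random

# Hypothetical toy model: every name and number below is an assumption
# made for illustration, not taken from the actual spin system.
N_A, N_B = 40, 5
_setup = random.Random(0)
BIAS_A = [_setup.uniform(0.3, 0.7) for _ in range(N_A)]  # distinct biases for A
BASE_B = [_setup.uniform(0.3, 0.7) for _ in range(N_B)]  # baseline biases for B
COUPLING = 0.8  # strength of the (made-up) dependence of B on A


def sample_a(rng):
    """Step 1: throw the N_A coins of subset A."""
    return tuple(int(rng.random() < p) for p in BIAS_A)


def conditional_bias_b(a):
    """Biases of the B coins given the observed string a (toy dependence)."""
    shift = COUPLING * (sum(a) / len(a) - 0.5)
    return [min(max(q + shift, 0.0), 1.0) for q in BASE_B]


def sample_b_given_a(a, rng):
    """Step 2: throw the N_B coins of subset B, conditioned on a."""
    return tuple(int(rng.random() < q) for q in conditional_bias_b(a))


def run(n_rounds, seed=1):
    """Repeat the two-step experiment and return the list of (a, b) pairs."""
    rng = random.Random(seed)
    rounds = []
    for _ in range(n_rounds):
        a = sample_a(rng)
        rounds.append((a, sample_b_given_a(a, rng)))
    return rounds


rounds = run(1000)
distinct_a = len({a for a, _ in rounds})
print(f"{distinct_a} distinct A-strings in {len(rounds)} rounds")
```

In this toy setting the dependence of $B$ on $A$ enters only through the fraction of ones in $a$; a real entangled state would generally give a much richer conditional structure, so the code illustrates the sampling constraint and not a test procedure.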