Action #5497

CFHTLS training tests with high resolution photo-z added

Added by Marie Treyer about 5 years ago. Updated about 5 years ago.

Status: New
Priority: Normal
Start date: 2019-12-18
% Done: 0%

Description

TRAINING SAMPLE:

Stephane added high-resolution photo-z + lower-resolution spec-z to the initial SPEC-only catalog that Johanna and Jerome previously used for training.
Total: ~250k galaxies with i < 25.5, plus ~15k galaxies kept aside for testing (randomly picked, but with a smooth N(z) distribution; one possible way to do this is sketched below).
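For the record, a minimal numpy sketch of one way to draw such a test set. This is only my guess at what "randomly picked but with smooth N(z)" means (sampling proportionally within redshift bins); the function name and binning are illustrative, not the actual selection code:

```python
import numpy as np

def pick_test_sample(z, n_test=15_000, n_bins=50, seed=0):
    """Draw a random test set whose N(z) tracks the parent sample's,
    by picking galaxies proportionally within redshift bins.
    Hypothetical reading of "randomly picked but with smooth N(z)"."""
    rng = np.random.default_rng(seed)
    edges = np.linspace(z.min(), z.max(), n_bins + 1)
    bins = np.clip(np.digitize(z, edges) - 1, 0, n_bins - 1)
    picked = []
    for b in range(n_bins):
        members = np.flatnonzero(bins == b)
        k = min(int(round(n_test * members.size / z.size)), members.size)
        if k > 0:
            picked.append(rng.choice(members, size=k, replace=False))
    return np.concatenate(picked)  # indices of the ~15k test galaxies
```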

Here are the mag/zspec and zspec distributions ("zspec" refers to the redshifts used for training, even when they are actually photo-z):

TRAINING TESTS:

model "x" (Jo&Je settings) :
learning rate = 0.0001 to iteration 150000
learning rate = 0.00001 from iteration 150000 to 300000
the model is saved at iteration 300k

Given that the loss function and the other validation-sample metrics seem to reach a minimum well before iteration 300k (see fig below), I tried these two things (both schedules are sketched in code after the list):

model "u":
learning rate = 0.0001 to iteration 80000
learning rate = 0.00001 from iteration 80000 to 200000
models are saved at iterations 100k, 130k, 160k, and 200k

model "v":
learning rate = 0.0001 to iteration 50000
learning rate = 0.00001 from iteration 50000 to 200000
models are saved at iterations 100k, 130k, 160k, and 200k
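For reference, the three schedules reduce to a single step function (a sketch only; the function name is mine, and the real training loop of course wires this into the optimizer):

```python
def learning_rate(iteration, model="x"):
    """Two-step schedules tested above: 1e-4 up to the switch
    iteration, then 1e-5 until the end of training."""
    switch = {"x": 150_000, "u": 80_000, "v": 50_000}[model]
    return 1e-4 if iteration < switch else 1e-5
```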

Here's what's happening. There are 5 cross-validations for each model; the averages are shown in black.
M_square = <(zspec - zcnn_mean)^2>
bias = <(zcnn_mean - zspec)/(1 + zspec)>, as in our paper (the bias plot is incomplete because I added it to the code halfway through the process).
I kept zcnn_mean (the PDF-weighted mean) because it's faster to compute, although the median gives better results.
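For clarity, here's how these quantities would be computed with numpy, assuming the PDFs are sampled on a common z grid (function names are mine, not the actual code):

```python
import numpy as np

def zcnn_mean(pdfs, z_grid):
    """PDF-weighted mean redshift (faster to compute than the median)."""
    return (pdfs * z_grid).sum(axis=1) / pdfs.sum(axis=1)

def m_square(zspec, zcnn):
    """M_square = <(zspec - zcnn)^2>"""
    return np.mean((zspec - zcnn) ** 2)

def bias(zspec, zcnn):
    """bias = <(zcnn - zspec) / (1 + zspec)>, as in the paper."""
    return np.mean((zcnn - zspec) / (1.0 + zspec))
```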

INFERENCES:

The performance at 160k and 200k for "u" and "v" is nearly identical, and only slightly better than at 100k.
Here's how the models compare on the test sample that was kept aside (ZCNN is the PDF median here):
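For reference, a minimal sketch of the PDF median used here, assuming a PDF sampled on a z grid (illustrative, not the production code):

```python
import numpy as np

def zcnn_median(pdf, z_grid):
    """Median of one redshift PDF sampled on z_grid: the z where the
    cumulative distribution crosses 0.5."""
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]
    return np.interp(0.5, cdf, z_grid)
```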


The models aren't significantly different, but the "u" (and even "v") trainings run in half the time of "x" (~4h per cross-validation versus ~8h). The PDFs are also smoother. I wanted to show a random sample of PDFs, as well as the distribution of local peaks (above 5%) for "x", "u" and "v", but I seem to have exceeded my quota. Can we change this? Also, Jerome is not part of this group and Johanna's address will change soon; we need to do something about that too!
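A sketch of the local-peak counting, assuming "above 5%" means 5% of the main peak's height (an assumption; adjust the threshold if it was meant to be absolute):

```python
import numpy as np
from scipy.signal import find_peaks

def local_peaks(pdf, z_grid, frac=0.05):
    """Redshifts of the local maxima of one PDF that rise above 5% of
    the main peak's height."""
    idx, _ = find_peaks(pdf, height=frac * pdf.max())
    return z_grid[idx]
```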


Files

DENSITY.png (28.9 KB) - Marie Treyer, 2019-12-18 14:19
SIGMA_ZMED_TEST.png (34.7 KB) - Marie Treyer, 2019-12-18 14:19
NZ.png (9.95 KB) - Marie Treyer, 2019-12-18 14:35
TRAINING_PERFS.png (88.2 KB) - Marie Treyer, 2019-12-18 14:36
VALIDATION_PERFS.png (215 KB) - Marie Treyer, 2019-12-18 14:36
DELTAZ_ZMED_TEST.png (32.1 KB) - Marie Treyer, 2019-12-18 17:10
ZSPEC_ZCNN_ilt235_TEST.png (78.2 KB) - Marie Treyer, 2019-12-18 17:12
ZSPEC_ZCNN_igt235_TEST.png (80.3 KB) - Marie Treyer, 2019-12-18 17:12
PIT_TEST.png (22.5 KB) - Marie Treyer, 2019-12-18 17:20
NZ_TEST.png (41.2 KB) - Marie Treyer, 2019-12-18 17:29
#1

Updated by Stephane Arnouts about 5 years ago

Marie Treyer wrote:
> The models aren't significantly different, but the "u" (and even "v") trainings run in half the time of "x" [...] Also, Jerome is not part of this group and Johanna's address will change soon; we need to do something about that too!

Great! The results look quite similar between the 3 versions. The "v" model also yields broader PDFs, with a better PIT in the end. I guess the stats also mix DEEP and WIDE images, which should be distinguished.
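For reference, the PIT is just each PDF's CDF evaluated at the true redshift; a minimal numpy sketch, assuming PDFs on a common z grid (names illustrative):

```python
import numpy as np

def pit(pdfs, z_grid, zspec):
    """Probability Integral Transform: each PDF's CDF evaluated at the
    true redshift. A flat PIT histogram means well-calibrated PDFs,
    which is why broader PDFs can improve it."""
    cdfs = np.cumsum(pdfs, axis=1)
    cdfs /= cdfs[:, -1:]
    return np.array([np.interp(z, z_grid, c) for z, c in zip(zspec, cdfs)])
```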

I've added Jerome to the wiki members!
#2

Updated by Stephane Arnouts about 5 years ago

Marie Treyer wrote:
> model "x" (Jo&Je settings): learning rate = 0.0001 to iteration 150000, then 0.00001 from 150000 to 300000; the model is saved at iteration 300k [...]
> M_square = <(zspec - zcnn_mean)^2>

It seems that the "x" model from J&J misses the minimum before 150k and then degrades (in the loss and M^2) even with a smaller learning rate, in contrast to the "u" and "v" models. But the weird thing is that the sigma is still better for the "x" model at the end. Is this normal?
When the learning rate changes there is a boost in the loss and sigma. Can we change the learning rate one more time, to see if another gain appears after 100-200k iterations? Something like the sketch below.
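A hypothetical third step, just to make the suggestion concrete (values purely illustrative, starting from the "v" schedule):

```python
def learning_rate_3step(iteration):
    """Hypothetical schedule with one more drop (values illustrative):
    starts like model "v", then a third step after the current range."""
    if iteration < 50_000:
        return 1e-4
    if iteration < 150_000:
        return 1e-5
    return 1e-6
```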