nerf_factory Source Code Notes (II)
This post was last updated 139 days ago, so some of its information may be out of date. If you spot an error, please email chengyulong@csu.edu.cn


Likewise, nerf_factory provides interfaces for many models, such as nerf, mipnerf, mipnerf360, plenoxel, and so on, as shown below:

(figure: the list of models supported by nerf_factory)

First, training. The code is built on PyTorch Lightning, which makes it highly structured. Let me first explain its training logic, to speed up your reading of the source:

A LightningModule organizes your PyTorch code into 6 sections:

  • Initialization (__init__ and setup()).
  • Train Loop (training_step())
  • Validation Loop (validation_step())
  • Test Loop (test_step())
  • Prediction Loop (predict_step())
  • Optimizers and LR Schedulers (configure_optimizers())

That is the official explanation: you write what train, val, test, and initialization should each do in the corresponding function, and the framework it wraps around your code will call them for you. Within that framework, the concrete training logic for train is:

# put model in train mode and enable gradient calculation
model.train()
torch.set_grad_enabled(True)

for batch_idx, batch in enumerate(train_dataloader):
    loss = training_step(batch, batch_idx)

    # clear gradients
    optimizer.zero_grad()

    # backward
    loss.backward()

    # update parameters
    optimizer.step()

So all we need to do is override the corresponding functions.
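To make that contract concrete, here is a minimal torch-only sketch of the single optimization step that Lightning performs around training_step() — the model and batch here are illustrative toys, not nerf_factory's:

```python
import torch

# One hand-written optimization step -- exactly what Lightning runs
# around training_step() for every batch (illustrative toy model).
model = torch.nn.Linear(3, 3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
batch = {"input": torch.randn(4, 3), "target": torch.randn(4, 3)}

def training_step(batch):
    pred = model(batch["input"])
    return torch.mean((pred - batch["target"]) ** 2)  # MSE loss

model.train()
loss = training_step(batch)   # forward
optimizer.zero_grad()         # clear gradients
loss.backward()               # backward
optimizer.step()              # update parameters
```

With Lightning, only the body of training_step() (and configure_optimizers()) is yours to write; the surrounding loop is the framework's.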

Let's start with the nerf model's code, beginning with training_step:

    def training_step(self, batch, batch_idx):

        rendered_results = self.model(
            batch, self.randomized, self.white_bkgd, self.near, self.far
        )
        rgb_coarse = rendered_results[0][0]
        rgb_fine = rendered_results[1][0]
        target = batch["target"]

        loss0 = helper.img2mse(rgb_coarse, target)
        loss1 = helper.img2mse(rgb_fine, target)
        loss = loss1 + loss0

        psnr0 = helper.mse2psnr(loss0)
        psnr1 = helper.mse2psnr(loss1)

        self.log("train/psnr1", psnr1, on_step=True, prog_bar=True, logger=True)
        self.log("train/psnr0", psnr0, on_step=True, prog_bar=True, logger=True)
        self.log("train/loss", loss, on_step=True)

        return loss

The key call is self.model, which performs the main computation; everything else computes the losses and metrics. MSE is the actual loss here, while PSNR is the standard image-quality metric derived from it:

import numpy as np
import torch

def img2mse(x, y):
    return torch.mean((x - y) ** 2)

def mse2psnr(x):
    return -10.0 * torch.log(x) / np.log(10)
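As a quick sanity check of the MSE→PSNR conversion (in plain Python, no torch needed): since the pixel values are normalized to [0, 1], PSNR is just −10·log10(MSE), so an MSE of 0.01 corresponds to 20 dB.

```python
import math

def mse2psnr(mse: float) -> float:
    # PSNR for images normalized to [0, 1]: -10 * log10(MSE)
    return -10.0 * math.log10(mse)

print(mse2psnr(0.01))  # ≈ 20 dB
```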

This matches the loss in the paper:
$$
\mathcal{L}=\sum_{r\in\mathcal{R}}\left[\lVert\hat{C}_c(r)-C(r)\rVert_2^2+\lVert\hat{C}_f(r)-C(r)\rVert_2^2\right]
$$
where $\hat{C}_c$ and $\hat{C}_f$ are the coarse and fine predicted colors and $C(r)$ is the ground-truth pixel color.
Now step into the NeRF model.

Model structure:

As described in the paper, the model is split into a coarse model and a fine model.

(figure: NeRF's coarse/fine two-stage pipeline from the paper)

class NeRF(nn.Module):
    def __init__(
        ...
                 ):
        ...
        super(NeRF, self).__init__()

        self.rgb_activation = nn.Sigmoid()
        self.sigma_activation = nn.ReLU()
        self.coarse_mlp = NeRFMLP(min_deg_point, max_deg_point, deg_view)
        self.fine_mlp = NeRFMLP(min_deg_point, max_deg_point, deg_view)

Step into the coarse_mlp model:

class NeRFMLP(nn.Module):
    def __init__(
        self,
        min_deg_point,
        max_deg_point,
        deg_view,
        netdepth: int = 8,
        netwidth: int = 256,
        netdepth_condition: int = 1,
        netwidth_condition: int = 128,
        skip_layer: int = 4,
        input_ch: int = 3,
        input_ch_view: int = 3,
        num_rgb_channels: int = 3,
        num_density_channels: int = 1,
    ):
        for name, value in vars().items():
            if name not in ["self", "__class__"]:
                setattr(self, name, value)

        super(NeRFMLP, self).__init__()

        self.net_activation = nn.ReLU()
        pos_size = ((max_deg_point - min_deg_point) * 2 + 1) * input_ch  # input dim of sample positions (matches the positional encoding discussed later)
        view_pos_size = (deg_view * 2 + 1) * input_ch_view  # input dim of view directions (matches the positional encoding discussed later)

        init_layer = nn.Linear(pos_size, netwidth)  # first input layer [63, 256]
        init.xavier_uniform_(init_layer.weight)  # Xavier weight initialization
        pts_linear = [init_layer]

        for idx in range(netdepth - 1):
            if idx % skip_layer == 0 and idx > 0:
                module = nn.Linear(netwidth + pos_size, netwidth)
            else:
                module = nn.Linear(netwidth, netwidth)
            init.xavier_uniform_(module.weight)
            pts_linear.append(module)

        self.pts_linears = nn.ModuleList(pts_linear)  # group the position layers into one module

        views_linear = [nn.Linear(netwidth + view_pos_size, netwidth_condition)]
        for idx in range(netdepth_condition - 1):
            layer = nn.Linear(netwidth_condition, netwidth_condition)
            init.xavier_uniform_(layer.weight)
            views_linear.append(layer)

        self.views_linear = nn.ModuleList(views_linear)  # group the view layers into one module

        self.bottleneck_layer = nn.Linear(netwidth, netwidth)
        self.density_layer = nn.Linear(netwidth, num_density_channels)  # outputs the density σ
        self.rgb_layer = nn.Linear(netwidth_condition, num_rgb_channels)  # outputs RGB

        init.xavier_uniform_(self.bottleneck_layer.weight)
        init.xavier_uniform_(self.density_layer.weight)
        init.xavier_uniform_(self.rgb_layer.weight)
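The two input-dimension formulas above can be checked numerically. Assuming the paper's default degrees (min_deg_point=0, max_deg_point=10, deg_view=4), they give 63 and 27, which is why the first layer is [63, 256]:

```python
def encoding_dims(min_deg_point, max_deg_point, deg_view,
                  input_ch=3, input_ch_view=3):
    # each frequency band contributes a sin and a cos (the *2); the +1
    # keeps the raw 3D input concatenated alongside its encoding
    pos_size = ((max_deg_point - min_deg_point) * 2 + 1) * input_ch
    view_pos_size = (deg_view * 2 + 1) * input_ch_view
    return pos_size, view_pos_size

print(encoding_dims(0, 10, 4))  # (63, 27)
```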

This code corresponds one-to-one with the network diagram below:

(figure: the NeRF MLP architecture diagram from the paper)

Finally, the two networks share the same structure: the values produced by the coarse network are used to refine the sampling strategy (importance sampling), the newly sampled points are then fed into the fine network, and its output is the final result.
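That resampling step can be sketched as inverse-transform sampling of the discrete CDF built from the coarse network's weights. This is a simplified pure-Python version (the real implementation operates on batched torch tensors and interpolates within bins):

```python
import bisect
import random

def sample_coarse_to_fine(bin_centers, weights, n_samples, rng=random.random):
    # Build the discrete CDF from the coarse network's weights, then draw
    # extra fine samples where the coarse pass placed high weight.
    total = sum(weights)
    cdf, acc = [], 0.0
    for w in weights:
        acc += w / total
        cdf.append(acc)
    samples = []
    for _ in range(n_samples):
        u = rng()
        idx = bisect.bisect_left(cdf, u)
        samples.append(bin_centers[min(idx, len(bin_centers) - 1)])
    return samples

# all weight on the middle bin -> every fine sample lands there
print(sample_coarse_to_fine([1.0, 2.0, 3.0], [0.0, 1.0, 0.0], 4))
```

This is why the fine network sees more samples near surfaces: bins where the coarse density was high occupy more of the CDF and so are hit by more of the uniform draws.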
